Data Manipulation with dplyr in R

select

select(data, variable_name)
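
As a minimal sketch of the call above, the following picks columns by name and by a contiguous range. The toy data frame and its values are invented for illustration and only stand in for a real dataset.

```r
library(dplyr)

# Hypothetical toy data; names and numbers are made up for this example
toy <- data.frame(
  state      = c("X", "X", "Y"),
  county     = c("c1", "c2", "c3"),
  population = c(1000, 5000, 2000),
  men        = c(480, 2600, 990)
)

# Keep two columns by name
by_name <- toy %>% select(state, population)

# Keep every column from state through population
by_range <- toy %>% select(state:population)

names(by_name)
names(by_range)
```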

The filter and arrange verbs

arrange

counties_selected <- counties %>%
  select(state, county, population, private_work, public_work, self_employed)

# Add a verb to sort in descending order of public_work
counties_selected %>%
  arrange(desc(public_work))

filter

counties_selected <- counties %>%
  select(state, county, population)

# Filter for counties in the state of California that have a population above 1000000
counties_selected %>%
  filter(state == "California",
         population > 1000000)

# Filter on several values at once
filter(id %in% c("a","b","c"...))    # keep rows whose id is in the set
filter(!id %in% c("a","b","c"...))   # keep rows whose id is not in the set
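
A small sketch of filtering on membership in a set of values and on its negation. The id column, the value column, and the wanted vector are hypothetical names used only for this example.

```r
library(dplyr)

# Hypothetical data; ids and values are invented for illustration
df <- data.frame(id = c("a", "b", "c", "d", "e"), value = 1:5)

wanted <- c("a", "b", "c")

# Rows whose id appears in the set
present <- df %>% filter(id %in% wanted)

# Rows whose id does not appear in the set: negate the membership test
absent <- df %>% filter(!id %in% wanted)

present$id
absent$id
```

%in% binds more tightly than the unary !, so !id %in% wanted is read as !(id %in% wanted).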

fct_relevel {forcats}

Reorder factor levels by hand

Useful for ordering when order() does not do the job.

f <- factor(c("a", "b", "c", "d"), levels = c("b", "c", "d", "a"))
fct_relevel(f)
fct_relevel(f, "a")
fct_relevel(f, "b", "a")

# Move to the third position
fct_relevel(f, "a", after = 2)

# Relevel to the end
fct_relevel(f, "a", after = Inf)
fct_relevel(f, "a", after = 3)

# Relevel with a function
fct_relevel(f, sort)
fct_relevel(f, sample)
fct_relevel(f, rev)
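
To see what these calls actually change, it can help to inspect the levels directly. The sketch below reuses the same factor f and only assumes that the forcats package is installed.

```r
library(forcats)

f <- factor(c("a", "b", "c", "d"), levels = c("b", "c", "d", "a"))

# Original level order
original <- levels(f)

# Move "a" to the front
front <- levels(fct_relevel(f, "a"))

# Place "a" after the second of the remaining levels
third <- levels(fct_relevel(f, "a", after = 2))

# Let a function decide the order of the levels
sorted <- levels(fct_relevel(f, sort))

original
front
third
sorted
```

Only the level order changes; the values stored in the factor stay the same.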

Filtering and arranging

counties_selected <- counties %>%
  select(state, county, population, private_work, public_work, self_employed)

# Filter for Texas and more than 10000 people; sort in descending order of private_work
counties_selected %>%
  filter(state == 'Texas', population > 10000) %>%
  arrange(desc(private_work))

# A tibble: 169 x 6
   state county  population private_work public_work self_employed
   <chr> <chr>        <dbl>        <dbl>       <dbl>         <dbl>
 1 Texas Gregg       123178         84.7         9.8           5.4
 2 Texas Collin      862215         84.1        10             5.8
 3 Texas Dallas     2485003         83.9         9.5           6.4
 4 Texas Harris     4356362         83.4        10.1           6.3
 5 Texas Andrews      16775         83.1         9.6           6.8
 6 Texas Tarrant    1914526         83.1        11.4           5.4
 7 Texas Titus        32553         82.5        10             7.4
 8 Texas Denton      731851         82.2        11.9           5.7
 9 Texas Ector       149557         82          11.2           6.7
10 Texas Moore        22281         82          11.7           5.9
# ... with 159 more rows

Mutate

counties_selected <- counties %>%
  select(state, county, population, public_work)

# Sort in descending order of the public_workers column
counties_selected %>%
  mutate(public_workers = public_work * population / 100) %>%
  arrange(desc(public_workers))

counties %>%
  # Select the five columns
  select(state, county, population, men, women) %>%
  # Add the proportion_men variable
  mutate(proportion_men = men / population) %>%
  # Filter for population of at least 10,000
  filter(population >= 10000) %>%
  # Arrange proportion of men in descending order
  arrange(desc(proportion_men))
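
The mutate-then-arrange pattern can be tried without the counties data. The sketch below uses a hypothetical three-row data frame in which public_work is a percentage, as in the snippet above; all values are invented.

```r
library(dplyr)

# Hypothetical toy data; public_work is a percentage of the population
toy <- data.frame(
  county      = c("c1", "c2", "c3"),
  population  = c(1000, 5000, 2000),
  public_work = c(10, 20, 30)
)

result <- toy %>%
  # Convert the percentage into a head count
  mutate(public_workers = public_work * population / 100) %>%
  # Largest head count first
  arrange(desc(public_workers))

result
```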

The count verb

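# Count the rows in each region, most frequent first
counties_selected %>%
  count(region, sort = TRUE)

# Weight the count by citizens to get the total per state
counties_selected %>%
  count(state, wt = citizens, sort = TRUE)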
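
The difference between a plain count and a weighted count is easy to check on a small example. The data frame below is hypothetical: it borrows the state and citizens column names, but the values are invented.

```r
library(dplyr)

# Hypothetical toy data
toy <- data.frame(
  state    = c("X", "X", "Y", "Y", "Y"),
  citizens = c(100, 200, 10, 20, 30)
)

# Plain count: number of rows per state
n_rows <- toy %>% count(state, sort = TRUE)

# Weighted count: sum of citizens per state
weighted <- toy %>% count(state, wt = citizens, sort = TRUE)

n_rows
weighted
```

With wt supplied, the n column holds the sum of the weights rather than the number of rows, so the two results can rank the states differently.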

Summarizing

# Summarize to find minimum population, maximum unemployment, and average income
counties_selected %>%
  summarize(
    min_population = min(population),
    max_unemployment = max(unemployment),
    average_income = mean(income)
  )

# Add a density column, then sort in descending order
counties_selected %>%
  group_by(state) %>%
  summarize(total_area = sum(land_area),
            total_population = sum(population),
            density = total_population / total_area) %>%
  arrange(desc(density))
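
A self-contained sketch of the grouped summary, using a hypothetical data frame with invented values. It relies on summarize() letting a later expression reuse a column created earlier in the same call.

```r
library(dplyr)

# Hypothetical toy data
toy <- data.frame(
  state      = c("X", "X", "Y"),
  land_area  = c(10, 30, 50),
  population = c(100, 300, 100)
)

by_state <- toy %>%
  group_by(state) %>%
  summarize(total_area = sum(land_area),
            total_population = sum(population),
            # Reuses the two totals defined just above
            density = total_population / total_area) %>%
  arrange(desc(density))

by_state
```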

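What I have come to realize is that it all boils down to a functional relationship: work out the simplest way to express that function. If you cannot write it down, it is probably the same difficulty as being stuck on word problems back in elementary school.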

top_n

Filter rows by rank.

# Extract the most populated row for each state
counties_selected %>%
  group_by(state, metro) %>%
  summarize(total_pop = sum(population)) %>%
  top_n(1, total_pop)
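
The sketch below runs the same pipeline on a hypothetical data frame with invented values. It also shows slice_max(), which superseded top_n() in dplyr 1.0.0; that part assumes dplyr 1.0.0 or later is installed.

```r
library(dplyr)

# Hypothetical toy data
toy <- data.frame(
  state      = c("X", "X", "X", "Y", "Y"),
  metro      = c("Metro", "Metro", "Nonmetro", "Metro", "Nonmetro"),
  population = c(100, 200, 50, 10, 40)
)

totals <- toy %>%
  group_by(state, metro) %>%
  # summarize() drops the last grouping level, so the result is grouped by state
  summarize(total_pop = sum(population))

# Top row per state with top_n()
top <- totals %>% top_n(1, total_pop)

# Same selection with slice_max() (assumes dplyr >= 1.0.0)
top_new <- totals %>% slice_max(total_pop, n = 1)

top
top_new
```

Because the summary is still grouped by state, both calls pick one row per state rather than one row overall.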

Selecting

Using the select verb, we can answer interesting questions about our dataset by focusing on related groups of variables.

The colon (
