pandas.resample()
http://www.cnblogs.com/hhh5460/p/5596340.html
Difference between resample and groupby:
resample: re-samples the data within a given unit of time
groupby: computes statistics over given groups of data entries
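To make the distinction concrete, here is a minimal sketch on a small made-up DataFrame (the column names `sensor` and `value` and the data are assumptions for illustration only). It uses the `'min'` spelling of the minute frequency, which pandas accepts as an alias for `'T'`.

```python
import pandas as pd

# Hypothetical sample data: six readings taken one minute apart.
index = pd.date_range('2000-01-01', periods=6, freq='min')
df = pd.DataFrame({'sensor': ['a', 'b', 'a', 'b', 'a', 'b'],
                   'value': [1, 2, 3, 4, 5, 6]}, index=index)

# resample: bins are defined by time (3-minute windows on the index).
by_time = df['value'].resample('3min').sum()

# groupby: groups are defined by the data itself (the 'sensor' column).
by_key = df.groupby('sensor')['value'].sum()

print(by_time)
print(by_key)
```

resample needs a datetime-like index to build its bins, while groupby works with any key column.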
Function prototype:
DataFrame.resample(rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0)
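Note that the how parameter is deprecated; call an aggregation method such as .sum() on the result of resample() instead.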
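As a rough sketch of what replaces `how`, the aggregation is called on the resampler object itself. The variable names below are illustrative, and `'3min'` is the same frequency as `'3T'`.

```python
import pandas as pd

index = pd.date_range('2000-01-01', periods=9, freq='min')
series = pd.Series(range(9), index=index)

# Instead of how='sum', call the aggregation on the resampler.
resampler = series.resample('3min')
totals = resampler.sum()
means = resampler.mean()

# Several aggregations at once via agg(); the result is a DataFrame
# with one column per aggregation.
summary = resampler.agg(['sum', 'mean'])
print(summary)
```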
Now for some practice:
import numpy as np
import pandas as pd
Start by creating a series with 9 one minute timestamps.
index = pd.date_range('1/1/2000', periods=9, freq='T')
series = pd.Series(range(9), index=index)
Downsample the series into 3 minute bins and sum the values of the timestamps falling into a bin.
series.resample('3T').sum()
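To see which timestamps fall into a bin, the first bin can be rebuilt by hand. This is only a sketch for checking the default behavior (bins closed on the left); the name `first_bin` is made up here, and `'3min'` is equivalent to `'3T'`.

```python
import pandas as pd

index = pd.date_range('2000-01-01', periods=9, freq='min')
series = pd.Series(range(9), index=index)

binned = series.resample('3min').sum()

# By default each bin is closed on the left, so the first bin
# [00:00, 00:03) holds the values at 00:00, 00:01 and 00:02.
start = pd.Timestamp('2000-01-01 00:00')
end = pd.Timestamp('2000-01-01 00:03')
first_bin = series[(series.index >= start) & (series.index < end)].sum()

print(binned)
print(first_bin)
```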
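Downsample the series into 3 minute bins as above, but label each bin using the right edge instead of the left. Note that the value in the bucket used as the label is not included in the bucket it labels: the bucket labeled 2000-01-01 00:03:00 sums to 3 (the values 0, 1 and 2) and does not include the value 3 recorded at 00:03:00. To include this value, close the right side of the bin interval, as in the next example.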
series.resample('3T', label='right').sum()
Downsample the series into 3 minute bins as above, but close the right side of the bin interval.
series.resample('3T', label='right', closed='right').sum()
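The effect of label and closed is easiest to see side by side. The sketch below runs the three variants on the same series; the variable names are illustrative, and `'3min'` is equivalent to `'3T'`.

```python
import pandas as pd

index = pd.date_range('2000-01-01', periods=9, freq='min')
series = pd.Series(range(9), index=index)

# Default: bins closed on the left and labeled with the left edge.
default = series.resample('3min').sum()

# label='right' only renames the bins; their contents stay the same.
right_label = series.resample('3min', label='right').sum()

# closed='right' changes membership: each bin now includes its right
# edge, so the value at 00:00 gets a bin of its own.
right_closed = series.resample('3min', label='right', closed='right').sum()

print(default)
print(right_label)
print(right_closed)
```

With closed='right' the result has four bins instead of three, because the first timestamp sits exactly on a bin edge.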
Upsample the series into 30 second bins.
series.resample('30S').asfreq()
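Upsample the series into 30 second bins and fill the NaN values using the pad method (forward fill: each gap takes the last preceding value). Newer pandas releases spell this method ffill(); pad() is the older name.
series.resample('30S').ffill()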
Upsample the series into 30 second bins and fill the NaN values using the bfill method.
series.resample('30S').bfill()
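The three upsampling variants can be compared on their first few rows. This sketch uses `ffill()` for the forward (pad) fill and writes the frequency as `'30s'`, which is the same frequency as `'30S'`; the variable names are made up for illustration.

```python
import pandas as pd

index = pd.date_range('2000-01-01', periods=9, freq='min')
series = pd.Series(range(9), index=index)

# asfreq() only inserts the new timestamps; the new rows are NaN.
as_is = series.resample('30s').asfreq()

# Forward fill: each new row copies the last known value before it.
forward = series.resample('30s').ffill()

# Backward fill: each new row copies the next known value after it.
backward = series.resample('30s').bfill()

print(as_is.head())
print(forward.head())
print(backward.head())
```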
Pass a custom function via apply
def custom_resampler(array_like):
    return np.sum(array_like) + 5

series.resample('3T').apply(custom_resampler)
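Any function that reduces one bin to a single value can be passed the same way. As a sketch, the block below runs the sum-plus-five function from above next to a second, hypothetical helper `value_spread` that is not part of the original example; `'3min'` is equivalent to `'3T'`.

```python
import numpy as np
import pandas as pd

index = pd.date_range('2000-01-01', periods=9, freq='min')
series = pd.Series(range(9), index=index)

def custom_resampler(array_like):
    # Sum of the values in one bin, plus a constant offset of 5.
    return np.sum(array_like) + 5

def value_spread(array_like):
    # Hypothetical helper: difference between the largest and
    # smallest value in one bin.
    return array_like.max() - array_like.min()

plus_five = series.resample('3min').apply(custom_resampler)
spread = series.resample('3min').apply(value_spread)

print(plus_five)
print(spread)
```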
Appendix: common time frequencies
A year
M month
W week
D day
H hour
T minute
S second
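A frequency string can carry a numeric multiple in front of the code, as in '3T' above. The sketch below applies that to hourly data. It is written with the lowercase `'h'` spelling on the assumption that it is accepted across pandas versions; as far as I know, recent releases prefer `'h'`, `'min'` and `'s'` over `'H'`, `'T'` and `'S'`, so check the offset alias table for the version you run. The data is made up.

```python
import pandas as pd

# Two days of hourly readings (hypothetical data).
index = pd.date_range('2000-01-01', periods=48, freq='h')
hourly = pd.Series(range(48), index=index)

# A number in front of the code gives a multiple: '6h' is six-hour bins.
six_hourly = hourly.resample('6h').sum()

# 'D' gives one bin per calendar day.
daily = hourly.resample('D').sum()

print(six_hourly)
print(daily)
```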