环境

Anaconda3 Python 3.6, Window 64bit

  • 目的

从MySQL数据库读取数据,进行数据查询、关联

  • 代码
# -*- coding: utf-8 -*-

"""
Author: kimbo zhang
Mail: kimbo_zhang@163.com
""" import pymysql
import pandas as pd
import numpy as np db_conn = pymysql.connect(
host="***",
user="kimbo",
passwd="***",
port=3306,
database="kimbo_test",
charset="utf8"
) # 执行sql语句
sql_cmd = "select id,coll_type,coll_amt,coll_cost from kimbo_test;"
sql_cmd2 = "select id,re_type as coll_type,re_amt as coll_amt,re_cost as coll_cost from kimbo_test2;" # 导入数据
data1 = pd.read_sql(sql_cmd, db_conn) # 赋值 kimbo_test
data2 = pd.read_sql(sql_cmd2, db_conn) # 赋值 kimbo_test2 # 1. select 查询
# eg: select * from kimbo_test limit 5;
sl = data1.head(10)
print("查询前10条记录:\n")
print(sl)
print("\n-------------->\n") # 2. 查询类型是:偏低的数据
# eg: select * from kimbo_test where coll_type='偏低';
wh = data1[(data1['coll_type'] == '偏低')]
print(wh) # 3. 查询类型是:偏低和昂贵的数据
# eg:select * from kimbo_test where coll_type in ('偏低','昂贵');
wh2 = data1[(data1['coll_type'] == '偏低') | (data1['coll_type'] == '昂贵')]
print(wh2) # 4. 查询类型是:偏低和昂贵,id 大于7000
# eg: select * from kimbo_test where coll_type in ('偏低','昂贵') and id >5000;
wh3 = data1[((data1['coll_type'] == '偏低') | (data1['coll_type'] == '昂贵')) & (data1['id'] > 5000)]
print(wh3) # 5. 分组、汇总
# eg: select coll_type,sum(coll_amt)as coll_amt,sum(coll_cost)as coll_cost from kimbo_test group by coll_type;
gb = data1.groupby('coll_type').agg({'coll_amt': np.sum, 'coll_cost': np.sum})
print(gb) # 5. 关联 inner join
# eg: select * from kimbo_test a inner join kimbo_test2 b on a.id=b.id ;
jn = pd.merge(data1, data2, on='id')
print(jn) # 6. 关联 left join
# eg: select * from kimbo_test a left join kimbo_test2 b on a.id=b.id ;
jn2 = pd.merge(data1, data1, on='id', how='left')
print(jn2) # 7. 关联 full join
# eg: select * from kimbo_test a full join kimbo_test2 b on a.id=b.id ;
jn3 = pd.merge(data1, data1, on='id', how='outer')
print(jn3) # 8. 合并 union all ,列名需要一致
# eg select * from kimbo_test a union all select * from kimbo_test2 b ;
un2 = pd.concat([data1, data2])
print(un2) # 9. 分析函数
# eg:
# SELECT * FROM (
# SELECT
# t.*,
# ROW_NUMBER() OVER(PARTITION BY coll_type ORDER BY coll_amt DESC) AS rn
# FROM kimbo_test t
# )
# WHERE rn < 3
# ORDER BY rn;
rn = data1.assign(rn=data1.sort_values(['coll_amt'], ascending=False).groupby('coll_type').cumcount() + 1
).query('rn < 3').sort_values(['coll_type', 'rn'])
print("row_number 分析函数结果:")
print(rn)
  • 结果展示

第一个和最后一个结果:

最新文章

  1. HDU 2202 计算几何
  2. OpenStack 企业私有云的若干需求(9): 云管理平台 CMP
  3. Nginx [emerg]: bind() to 0.0.0.0:80 failed (98: Address already in use)
  4. No.012 Integer to Roman
  5. 基于ASP.NET MVC和Bootstrap搭建响应式个人博客站(一)
  6. jq不识别拼接的对象id的解决方案
  7. js href的用法
  8. 我跟着siki学Unity3D游戏开发——PongGame
  9. 长方柱类【C++ 类定义】
  10. LeetCode 319. Bulb Switcher
  11. python 简单验证码 random模块
  12. WeQuant交易策略—MACD
  13. VxWorks:添加自己组件到Tornado
  14. Css Secret 案例全套
  15. python使用@property
  16. nodejs搭建web服务器
  17. 移动端web开发的注意点大总结
  18. Django-2.1基础操作
  19. maven 添加jdbc6
  20. 委托、Lambda表达式、事件系列02,什么时候该用委托

热门文章

  1. 我有一台 PC,上面有摄像头,怎么进行一场直播?
  2. Docker Libnetwork driver API
  3. HTML(form标签)、CSS
  4. web项目的getContextPath()
  5. jquery关于select框的取值和赋值
  6. 学点TCPDUMP
  7. html-2, a img ul li ol dl dt dd 标签与列表标签的简单使用
  8. PAT 天梯赛 L1-004. 计算摄氏温度 【水】
  9. 杭电1027Ignatius and the Princess II模拟
  10. 自动化测试调查问卷送《QTP自动化测试最佳实践》