Recently, my two teammates and I have been working on a project: a simplified Chinese search engine for children in primary school. We call it "kidsearch".

Since our project will be built on top of the Baidu search engine, I'd like to start with a brief analysis of Baidu.

First, Baidu is not designed for children. As a commercial company, Baidu provides the public with a free search service, so it is natural that not everything shown on the results page is what people need. Some entries appear for commercial reasons, and some for other reasons. This may not matter much to adults, who can tell good content from bad, but the impact on children is obvious. For example, try searching Baidu for these keywords: "波" ("wave"; look at the pictures), "交换群" ("commutative group"; look at the results), and "医院" ("hospital"; look at the advertisements). These are ordinary words, not to mention what some worse keywords return. The results are not just inappropriate; some of them are even harmful. This situation has to be fixed, and fixing it is the purpose of our project "kidsearch".

Actually, searching the Internet is a simpler task for children than for adults, so our problem is simpler as well. We can just use Baidu as a tool (this is no exaggeration): rearrange the results, fix or remove improper or useless entries, and add some content suitable for children. With these fixes, the search engine will serve children much better.
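The three steps above (remove improper entries, rearrange, add suitable content) can be sketched roughly as follows. This is only a minimal illustration, not our final design: the `Result` class, its `flagged` and `kid_friendly` fields, and the `rebuild_results` function are all hypothetical names, and the sketch assumes the results have already been fetched and labeled by some earlier step.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    url: str
    flagged: bool = False       # hypothetical: marked improper by an earlier check
    kid_friendly: bool = False  # hypothetical: e.g. the site is on a curated list


def rebuild_results(raw, curated):
    """Drop flagged entries, move kid-friendly ones up, put curated entries first."""
    # Step 1: remove entries that were marked as improper or useless.
    kept = [r for r in raw if not r.flagged]
    # Step 2: rearrange. list.sort() is stable, so the original order
    # is preserved inside the kid-friendly group and inside the rest.
    kept.sort(key=lambda r: not r.kid_friendly)
    # Step 3: add curated content on top, skipping duplicates by URL.
    curated_urls = {r.url for r in curated}
    return list(curated) + [r for r in kept if r.url not in curated_urls]
```

A quick usage example with made-up entries:

```python
raw = [
    Result("a", "http://example.com/a"),
    Result("b", "http://example.com/b", flagged=True),
    Result("c", "http://example.com/c", kid_friendly=True),
]
curated = [Result("k", "http://example.com/k", kid_friendly=True)]
for r in rebuild_results(raw, curated):
    print(r.title)
```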

So, what content is appropriate for children?

Based on the thoughts above, I have summarized what children may need from a search engine. (The list may not cover everything yet; we will refine it in the future.)

1. Concepts -- encyclopedia

2. Material -- pictures, music, videos

3. Entertainment -- games

4. Study -- homework, knowledge

Moreover, there are some kinds of content that children don't need:

1. Advertisements

2. Adult (mature) content

3. Sexual or homosexual content

4. Sidebars (mostly ads, adult content, or content that is useless for children)
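As a rough illustration of how such entries might be screened out, here is a minimal sketch. Everything in it is an assumption for illustration only: the entry dictionaries with `kind`, `url`, `title`, and `snippet` keys, the `kind` labels, the placeholder word list, and the example host are all hypothetical, and a real filter would need far more than simple word matching.

```python
from urllib.parse import urlparse

# All three sets are hypothetical placeholders, not real lists.
BLOCKED_KINDS = {"ad", "sidebar"}     # labels assumed to come from the page parser
BLOCKED_HOSTS = {"ads.example.com"}   # hosts we never want to show
BLOCKED_WORDS = {"成人", "赌博"}       # "adult", "gambling"


def rejection_reason(entry):
    """Return a short reason why an entry should be hidden, or None to keep it."""
    kind = entry.get("kind")
    if kind in BLOCKED_KINDS:
        return "kind:" + kind
    host = urlparse(entry.get("url", "")).hostname or ""
    if host in BLOCKED_HOSTS:
        return "host:" + host
    text = entry.get("title", "") + entry.get("snippet", "")
    for word in sorted(BLOCKED_WORDS):
        if word in text:
            return "word:" + word
    return None


def keep_for_children(entries):
    """Keep only the entries for which no rejection reason was found."""
    return [e for e in entries if rejection_reason(e) is None]
```

Returning a reason instead of a plain yes/no makes it easier to check later why a particular entry was hidden.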

Now that we know what children need and what they don't, the next step is to tackle these items one by one.

What technology will we use?

After trying several options, such as PHP, Java, and Python, I decided to use Python for this job because it makes crawling really convenient. Building web pages is a bit more difficult in Python than in PHP, but that doesn't matter much.

Besides, there is a huge number of third-party libraries for Python, such as requests, flask, django, and jieba. I have tried all of them briefly.

More details will follow later. Our aim is to create a search engine that children can use and enjoy using.
