Given two files, a and b, each holding 5 billion URLs of 64 bytes each, and a 4 GB memory limit, how do you find the URLs common to both files?
2024-08-26 12:51:06
package com.hadoop.hdfs;

import org.junit.Test;

import java.io.*;
import java.util.HashSet;

public class Suanfa1 {
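    @Test
    public void a1() throws IOException {
        // Split D:/aa.txt into 1000 bucket files by hash, so equal URLs always share a bucket index.
        // The writers stay open for the whole pass; reopening a FileWriter per line would
        // truncate the bucket file each time and keep only the last URL written to it.
        BufferedWriter[] writers = new BufferedWriter[1000];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new BufferedWriter(new FileWriter("D://aa" + i + ".txt"));
        }
        BufferedReader bufferedReader = new BufferedReader(new FileReader("D:/aa.txt"));
        String str1;
        while ((str1 = bufferedReader.readLine()) != null) {
            // floorMod keeps the index in 0..999 even when the long hash overflows to a negative value.
            int i = (int) Math.floorMod(hashCode(str1), 1000L);
            writers[i].write(str1);
            writers[i].newLine();
        }
        bufferedReader.close();
        for (BufferedWriter writer : writers) {
            writer.close();
        }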
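    }

    public void a2() throws IOException {
        // Same partitioning for D:/bb.txt, using the same hash so bucket i of b pairs with bucket i of a.
        BufferedWriter[] writers = new BufferedWriter[1000];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new BufferedWriter(new FileWriter("D://bb" + i + ".txt"));
        }
        BufferedReader bufferedReader = new BufferedReader(new FileReader("D:/bb.txt"));
        String str1;
        while ((str1 = bufferedReader.readLine()) != null) {
            // floorMod keeps the index in 0..999 even when the long hash overflows to a negative value.
            int i = (int) Math.floorMod(hashCode(str1), 1000L);
            writers[i].write(str1);
            writers[i].newLine();
        }
        bufferedReader.close();
        for (BufferedWriter writer : writers) {
            writer.close();
        }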
    }

    // Same polynomial as String.hashCode (multiplier 31), but accumulated in a long.
    public long hashCode(String str) {
        long h = 0;
        for (int i = 0; i < str.length(); i++) {
            h = 31 * h + str.charAt(i);
        }
        return h;
    }

    @Test
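    public void a3() throws IOException {
        a1();
        a2();
        for (int i = 0; i < 1000; i++) {
            File fileA = new File("D://aa" + i + ".txt");
            File fileB = new File("D://bb" + i + ".txt");
            // A bucket that received no URLs on either side cannot contribute a common URL.
            if (!fileA.exists() || !fileB.exists()) {
                continue;
            }
            // Load bucket i of file a into memory, then stream bucket i of file b against it.
            HashSet<String> set = new HashSet<>();
            try (BufferedReader bufferedReader1 = new BufferedReader(new FileReader(fileA))) {
                String input1;
                while ((input1 = bufferedReader1.readLine()) != null) {
                    // Store the line just read (not a second readLine() call), and store the URL
                    // itself: comparing hashes alone would report false matches on collisions.
                    set.add(input1);
                }
            }
            try (BufferedReader bufferedReader2 = new BufferedReader(new FileReader(fileB))) {
                String input2;
                while ((input2 = bufferedReader2.readLine()) != null) {
                    if (set.contains(input2)) {
                        System.out.println(input2);
                    }
                }
            }
        }
    }
}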
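
The idea behind the listing is hash partitioning: 5 billion URLs at 64 bytes each come to about 320 GB per file, so neither file fits in 4 GB, but 1000 buckets average roughly 320 MB of raw data each, and identical URLs always hash to the same bucket index. Below is a minimal in-memory sketch of that idea on small lists. The class `UrlIntersectSketch` and its methods `bucketOf`, `partition` and `commonUrls` are hypothetical names introduced for illustration, the lists stand in for the bucket files on disk, and it uses the built-in `String.hashCode` rather than the hand-written hash above. Note that a Java `HashSet<String>` needs noticeably more memory than the raw bytes, so the bucket count may need tuning in practice.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory sketch of hash partitioning followed by per-bucket intersection.
final class UrlIntersectSketch {

    // Maps a URL to a bucket index in [0, buckets); floorMod handles negative hash codes.
    static int bucketOf(String url, int buckets) {
        return Math.floorMod(url.hashCode(), buckets);
    }

    // Splits the input into `buckets` lists; each list stands in for one bucket file.
    static List<List<String>> partition(List<String> urls, int buckets) {
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < buckets; i++) {
            parts.add(new ArrayList<>());
        }
        for (String url : urls) {
            parts.get(bucketOf(url, buckets)).add(url);
        }
        return parts;
    }

    // Returns the distinct URLs present in both inputs, comparing only buckets with the same index.
    static Set<String> commonUrls(List<String> a, List<String> b, int buckets) {
        List<List<String>> partsA = partition(a, buckets);
        List<List<String>> partsB = partition(b, buckets);
        Set<String> common = new HashSet<>();
        for (int i = 0; i < buckets; i++) {
            // Only one bucket of a is held in a set at a time, which is what bounds memory use.
            Set<String> seen = new HashSet<>(partsA.get(i));
            for (String url : partsB.get(i)) {
                if (seen.contains(url)) {
                    common.add(url);
                }
            }
        }
        return common;
    }
}
```

Calling `UrlIntersectSketch.commonUrls(a, b, 1000)` on two lists returns the set of URLs that occur in both. Collecting matches into a set also means a URL repeated in b is reported once, whereas the listing above prints it once per occurrence.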