
Advanced Hadoop: Mining Valuable Users from Social Relationships with PeopleRank

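Published: 2021-03-06 16:13:32 | Category: Big Data | Source: compiled from the web
Summary: Please credit the source when reposting. Reposted from Thinkgamer's CSDN blog: blog.csdn.net/gamer_gyt Code download: click to view 1: PageRank and PeopleRank 2: Requirements analysis: mining valuable users of CSDN blogs 3: Algorithm model: the PeopleRank algorithm 4: Architecture design: from data preparation to a MapReduce version of the PR algorithm 5: Program development: had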

Only part of the code is shown below; the rest can be downloaded from GitHub.

dataEtl.java

package pagerankjisuan;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class dataEtl {

	public static void main() throws IOException {

		File f1 = new File("MyItems/pagerankjisuan/people.csv");
		if(f1.isFile()){
			f1.delete();
		}
		File f = new File("MyItems/pagerankjisuan/peoplerank.txt");
		if(f.isFile()){
			f.delete();
		}
		// Open the input file
		File file = new File("MyItems/pagerankjisuan/day7_author100_mess.csv");
		// Create a buffered reader over the file
		BufferedReader reader = new BufferedReader(new FileReader(file));
		try {
			String line=null;
			// Read line by line until readLine() returns null (end of file)
			while( (line=reader.readLine()) != null)
			{
					String[] userMess = line.split( "," );
					// Field 0 is the user id; the last field (index 10 or 9, depending on the column count) is the fan list
					String userid = userMess[0];
					if(userMess.length!=0){
							if(userMess.length==11)
							{
									int i=0;
									String[] focusName = userMess[10].split("\\|"); // "|" is a regex metacharacter and must be escaped
									for (i=1;i < focusName.length; i++) 
										{
											write(userid,focusName[i]);
//											System.out.println(userid+ "           " + focusName[i]);
										}
							}
							else
							{
									int j =0;
									String[] focusName = userMess[9].split("\\|"); // "|" is a regex metacharacter and must be escaped
									for (j=1;j < focusName.length; j++) 
									{
										write(userid,focusName[j]);
//										System.out.println(userid+ "           " + focusName[j]);
									}
							}		
					}
				}
			} 
			catch (FileNotFoundException e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
			finally
			{
					reader.close();
				
					// Build peoplerank.txt: one "id<TAB>1" line for each id from 1 to 100
					for(int i=1;i<=100;i++){
						FileWriter writer = new FileWriter("MyItems/pagerankjisuan/peoplerank.txt",true);
						writer.write(i + "\t" + 1 + "\n");
						writer.close();
					}
			}
			System.out.println("OK..................");
	}

	private static void write(String userid,String nameid) {
		// TODO Auto-generated method stub
		// Append one "userid,nameid" line to people.csv
		try {
			if(!nameid.contains("\n")){
				FileWriter writer = new FileWriter("MyItems/pagerankjisuan/people.csv",true);
				writer.write(userid + "," + nameid + "\n");
				writer.close();
			}
		} catch (IOException e) {
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
	}

}
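
The comments in the listing above point out that `|` needs escaping when the follow list is split. The reason is that `String.split` takes a regular expression, and a bare `|` is the alternation operator rather than a literal pipe. The sketch below shows the difference on a made-up sample field. The class name `SplitPipeDemo`, the helper `parseFollowList`, and the sample value are hypothetical and are not part of the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitPipeDemo {

    // Hypothetical helper: splits a "|"-delimited field on the literal pipe
    // and skips element 0, mirroring the loops in dataEtl that start at index 1.
    static List<String> parseFollowList(String field) {
        String[] parts = field.split("\\|"); // escaped: matches a literal '|'
        List<String> names = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            names.add(parts[i]);
        }
        return names;
    }

    public static void main(String[] args) {
        String field = "|tom|jerry"; // made-up sample, not taken from the real data

        // Unescaped: the regex "|" matches the empty string, so the field is
        // broken up between individual characters instead of at the pipes.
        System.out.println(Arrays.toString(field.split("|")));

        // Escaped: the field is split at each literal pipe.
        System.out.println(parseFollowList(field));
    }
}
```

An equivalent alternative to the manual escape is `field.split(Pattern.quote("|"))` from `java.util.regex.Pattern`, which quotes the delimiter for you.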

prjob.java
