Web Scraping on Instagram – Counts of posts, followers and following
I was browsing this developer's GitHub and gave his package ‘iscrape’ a go: https://github.com/royfrancis/iscrape
# Remove all objects from the current workspace
rm(list = ls())

# Install dependency packages
install.packages(c("devtools", "dplyr", "httr", "stringr"), dependencies = TRUE)

# Install the iscrape package from GitHub
devtools::install_github("royfrancis/iscrape")

# Load iscrape
library("iscrape")

# Define the user page to scrape
pageuser <- get_page_user("h_shawberry")

# Get the user's 1. post count, 2. followers, 3. following
get_count_post(pageuser)
get_count_follower(pageuser)
get_count_following(pageuser)

# Explore hashtags: define which hashtag page to scrape
hashtagpage <- get_page_hashtag("research")

# Count the number of posts under the hashtag
get_count_hashtag(hashtagpage)

# Loop to get this information for many users
names <- c("h_shawberry", "nintendo_travel_insta", "lincoln_psychology")
len <- length(names)
klist <- vector("list", length = len)

for (i in seq_len(len)) {
  cat(paste0("\nRunning ", i, " of ", len, "; ", names[i], "; "))

  pu <- get_page_user(names[i])
  pcount <- get_count_post(pu)
  cat(pcount, " ")
  fcount <- get_count_follower(pu)
  cat(fcount, " ")
  focount <- get_count_following(pu)
  cat(focount, "; ")

  # The username is also looked up as a hashtag
  ph <- get_page_hashtag(names[i])
  hcount <- get_count_hashtag(ph)
  cat(hcount, ";")

  klist[[i]] <- data.frame(name = names[i], posts = pcount,
                           followers = fcount, following = focount,
                           hashtagcounts = hcount, stringsAsFactors = FALSE)

  # Vary the timing between page requests
  Sys.sleep(sample(1:6, 1, replace = TRUE))
}

# Combine the per-user rows into one data frame
dplyr::bind_rows(klist)
Here is the outcome of the function:

  name        posts followers following hashtagcounts
1 h_shawberry   837       236       389 […]
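One thing I noticed: if a page request throws an error partway through (a renamed account, say), a plain for loop would presumably abort and lose the rows collected so far. Below is a minimal sketch of one way around that, wrapping each lookup in `tryCatch` and filling in `NA` for failures. Note that `collect_counts` and `fake_fetch` are hypothetical names I made up for illustration and are not part of iscrape; `fake_fetch` returns made-up numbers so the example runs without touching Instagram.

```r
# Hypothetical helper (not part of iscrape): build one row per user and
# record NA counts when the lookup for that user fails.
# `fetch_counts` is assumed to be a function(name) returning a list with
# elements posts, followers and following.
collect_counts <- function(usernames, fetch_counts, pause = 0) {
  rows <- lapply(usernames, function(u) {
    # Turn any error from the lookup into NULL instead of stopping
    res <- tryCatch(fetch_counts(u), error = function(e) NULL)
    if (pause > 0) Sys.sleep(pause)
    if (is.null(res)) {
      return(data.frame(name = u, posts = NA_integer_,
                        followers = NA_integer_, following = NA_integer_,
                        stringsAsFactors = FALSE))
    }
    data.frame(name = u,
               posts = as.integer(res$posts),
               followers = as.integer(res$followers),
               following = as.integer(res$following),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in fetcher with made-up numbers, so nothing is actually scraped
fake_fetch <- function(name) {
  if (name == "missing_user") stop("page not found")
  list(posts = 10L, followers = 20L, following = 30L)
}

result <- collect_counts(c("user_a", "missing_user"), fake_fetch)
print(result)
```

To use it with iscrape, the fetcher could be a small function that calls `get_page_user()` once and then passes the page to `get_count_post()`, `get_count_follower()` and `get_count_following()`, with `pause` set to a few seconds to space out the requests.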