Can tweets reveal personality differences between iPhone and Android users?

To begin answering this question, I first created an app on Twitter.

Once the app was made, I clicked “manage keys and access tokens” in the application settings. On this page you can generate a consumer key and consumer secret, as well as an access token and access token secret. These four codes are needed for Twitter authentication when retrieving tweets, so it is important to write them down. In the code below, you will need to add your own authentication codes, as I have removed mine for security reasons. Once this is done, the only thing you’ll need to change is the save destination of your files. The rest should run without editing as long as you have installed the required packages.

Then, in RStudio, I used the following code to get tweets:


I then cleaned the tweets by removing retweets, duplicate users and repeated tweets. To get an independent sample, it was important that no user contributed more than one tweet.
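As a rough illustration of these three cleaning steps, here is a minimal base R sketch on toy data. The data frame and its column names (`screenName`, `isRetweet`) are assumptions made for the example, and the helper `clean_tweets` is hypothetical rather than the code originally used in this post.

```r
# Toy data standing in for the downloaded tweets.
# All column names here are assumptions made for illustration.
tweets <- data.frame(
  id         = 1:6,
  screenName = c("ann", "bob", "ann", "cat", "dan", "eve"),
  text       = c("Loving my new phone",
                 "RT @ann: Loving my new phone",
                 "Second tweet from ann",
                 "Good morning",
                 "Good morning",
                 "Coffee time"),
  isRetweet  = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper that applies the three cleaning steps in order.
clean_tweets <- function(df) {
  # 1. Drop retweets (flagged as such, or with text starting "RT @").
  df <- df[!df$isRetweet & !grepl("^RT @", df$text), ]
  # 2. Keep only the first tweet from each user.
  df <- df[!duplicated(df$screenName), ]
  # 3. Drop tweets whose text repeats an earlier tweet.
  df <- df[!duplicated(df$text), ]
  df
}

cleaned <- clean_tweets(tweets)
print(cleaned)
```

The order matters slightly: removing retweets first means a user whose only original tweet comes after one of their retweets is still kept.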

Finally, I wanted to save this data, and have separate files for iPhone and Android tweets.
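One way to split and save the data could look like the sketch below. It assumes the `source` column already holds a readable client name such as "Twitter for iPhone"; the file names and the use of `tempdir()` as the save destination are placeholders, so swap in your own folder.

```r
# Toy stand-in for the cleaned tweets; the column names follow the
# post (id, source, text, created), the values are made up.
tweets <- data.frame(
  id      = 1:4,
  source  = c("Twitter for iPhone", "Twitter for Android",
              "Twitter for iPhone", "Twitter Web Client"),
  text    = c("tweet a", "tweet b", "tweet c", "tweet d"),
  created = c("2017-01-01", "2017-01-01", "2017-01-02", "2017-01-02"),
  stringsAsFactors = FALSE
)

# Placeholder save destination; replace with your own folder.
out_dir <- tempdir()

# Split by device, ignoring tweets sent from any other client.
iphone  <- tweets[grepl("iPhone",  tweets$source, fixed = TRUE), ]
android <- tweets[grepl("Android", tweets$source, fixed = TRUE), ]
both    <- rbind(iphone, android)

# Write one combined file and one file per device (hypothetical names).
write.csv(both,    file.path(out_dir, "tweets_all.csv"),     row.names = FALSE)
write.csv(iphone,  file.path(out_dir, "tweets_iphone.csv"),  row.names = FALSE)
write.csv(android, file.path(out_dir, "tweets_android.csv"), row.names = FALSE)
```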


You should be left with three files. The data file containing both iPhone and Android tweets looks like this:


To analyse this text, I used the LIWC 2015 software to assess the psychological properties of iPhone and Android users’ tweets.

Because the Twitter data is saved as a CSV file, LIWC can analyse every person (row) of data individually. We don’t need to copy and paste the tweets into a Word file to get an overall score for each dimension. This means every individual has their own LIWC score on each dimension. This was useful for statistical analysis, which I will discuss further down in this post.


The output of the LIWC analysis looks like this once saved as an Excel file:

To make this final CSV file ready for importing back into R, I replaced the headers Source (A), Source (B), Source (C), Source (D), with the titles underneath (id, source, text and created) and then deleted the empty row.

I then repeated these LIWC steps for the iPhone and Android data.

In particular, I was interested in whether Android and iPhone users differed on four summary variables. Each is scored on a scale from 0 to 100. The algorithms behind these scores are not available due to the authors’ prior commercial agreements. The variables are:

○ Analytical thinking – a high number reflects formal, logical, and hierarchical thinking; lower numbers reflect more informal, personal, here and now, narrative thinking.

○  Clout – a high number suggests that the author is speaking from the perspective of high expertise and is confident; low Clout numbers suggest a more tentative, humble, even anxious style.

○  Authentic – higher numbers are associated with a more honest, personal, and disclosing text; lower numbers suggest a more guarded, distanced form of discourse.

○  Emotional tone – a high number is associated with a more positive, upbeat style; a low number reveals greater anxiety, sadness, or hostility. A number around 50 suggests either a lack of emotionality or different levels of ambivalence.

All of these summary variables refer to writing styles, which are used to infer personality traits. The associated papers describe how each score was developed:

The analytical thinking score was developed from data on 50,000 admission essays from a large state university across the years 2004-2007. The authors found that higher grades were associated with greater article (a, an, the) and preposition (to, above) use. Lower grades were associated with greater use of auxiliary verbs (is, have), personal pronouns (I, her, they), impersonal pronouns (it, thing), adverbs (so, really, very), conjunctions (and, but) and negations (no, never). This was used to develop the categorical-dynamic index (CDI) using principal component analysis. This is a bipolar scale because the more students used articles and prepositions, the less they used pronouns and other function words. On one end is categorical language “which combines heightened abstract thinking (associated with greater article use) and cognitive complexity, (associated with greater use of prepositions). A lower CDI involves greater use of auxiliary verbs, adverbs, conjunctions, impersonal pronouns, negations, and personal pronouns. These word categories, particularly pronouns and auxiliary verbs have been associated with more time-based stories, and reflect a dynamic or narrative language style.” They later found that higher CDI scores were associated with higher academic performance as measured by GPA (grade point average).

Pennebaker, J. W., Chung, C. K., Frazee, J., Lavergne, G. M., & Beaver, D. I. (2014). When small words foretell academic success: The case of college admissions essays. PLoS ONE, 9(12), 1–10.

The clout score was developed from 5 studies which measured language differences between people of different ranks or status. In study 1, a leader was randomly assigned to a group through a bogus leadership questionnaire, and for 30 minutes the group had to agree on a series of decisions. In study 2, participants worked in pairs to solve complex problems over instant messenger and were asked to self-report perceived power through the questions “To what degree did you control the conversation?” and “To what degree did you have power in the conversation?”. In study 3, people talked face to face about everyday topics; as in study 1, the conversations were transcribed, and participants rated their self-perceived power in the same way as in study 2. Study 4 analysed emails between participants and their correspondents; participants rated their status relative to each of their chosen correspondents on a scale from 1 = other has much lower status to 7 = other has much higher status. Study 5 analysed military letters between soldiers of the Iraqi military associated with Saddam Hussein’s regime. Overall, the authors found that pronoun use reflects position in the social hierarchy. First person singular pronouns (I, me) were associated with lower status and suggest more self-attention. First person plurals (we, us) and second person singular pronouns (you, your) were used more by those with higher status.

Kacewicz, E., Pennebaker, J. W., Davis, M., Jeon, M., & Graesser, A. C. (2014). Pronoun Use Reflects Standings in Social Hierarchies. Journal of Language and Social Psychology, 33(2), 125–143.

The Authentic score was developed across 5 studies which compared the linguistic properties of false stories and true stories. In study 1, participants were taped discussing both true and false views on abortion, and were asked to be as believable as possible. In study 2, participants were asked to type both true and false views on abortion, and were encouraged to be as persuasive as possible. In study 3, participants hand wrote both true and false views on abortion and again were asked to be as truthful and deceptive as possible. In study 4, participants were asked to provide verbal true and false descriptions of people they truly liked and disliked, and were again asked to be honest and convincing. Study 5 was a mock crime scenario in which half the participants were asked to look around a room and the other half were told to steal a dollar bill. All participants were accused of taking the money and were told to deny this accusation. They were told that if the interrogator was convinced of their innocence, they would get the dollar bill. Across all the studies, liars used first-person singular pronouns (I, my, me) at a lower rate than truth tellers. Second, liars used negative emotion words (hate, worthless, enemy) more than truth tellers. Third, liars used fewer exclusive words (but, except, without), which are normally associated with cognitive complexity and are used to mark what is in a given category and what is not. Fourth, liars used third person pronouns (he, she, they) at a lower rate. The research found that LIWC classified liars and truth tellers with 67% accuracy.

Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words: predicting deception from linguistic cues. Personality and Social Psychology Bulletin, 29(5), 665–675.

The emotional tone score was developed by analysing the diaries of 1084 online journal users for a period of 4 months, two months before and two months after the September 11th attacks in 2001. When analysing positive and negative emotional words, the September 11 attacks reduced positivity on average by 1.36 standard deviations, and positivity then increased monotonically over the next week until it returned to baseline. This bipolar emotional positivity scale was calculated as the difference between LIWC scores for positive emotion words (happy, good, nice) and negative emotion words (kill, ugly, guilty). Higher scores indicate greater positivity.

Cohn, M. A., Mehl, M. R., & Pennebaker, J. W. (2004). Linguistic Markers of Psychological Change Surrounding September 11, 2001. Psychological Science, 15(10), 687–694.

To compare iPhone and Android users on these traits, I imported my data back into R.

Then I calculated averages to explore the data:

The data consisted of 1027 Android user tweets and 2209 iPhone user tweets. On average, 55.57% of the words used in the tweets were recognised by the LIWC dictionary. The average word count for both iPhone and Android user tweets was 8.5 words.
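Descriptive statistics of this kind can be computed along the following lines. This is a sketch on made-up numbers, not the original code: the toy data frame stands in for the imported LIWC output, and the column names `WC` (word count) and `Dic` (percentage of words found in the dictionary) are assumed to match the LIWC export.

```r
# Toy stand-in for the LIWC output after importing it back into R.
# Column names WC and Dic are assumed; the values are made up.
liwc <- data.frame(
  source = c("iPhone", "iPhone", "iPhone", "Android", "Android"),
  WC     = c(8, 10, 6, 9, 7),
  Dic    = c(50, 60, 70, 40, 60),
  stringsAsFactors = FALSE
)

# Number of tweets per device.
counts <- table(liwc$source)

# Mean word count and mean dictionary coverage per device.
group_means <- aggregate(cbind(WC, Dic) ~ source, data = liwc, FUN = mean)

# Overall share of words recognised by the dictionary.
overall_dic <- mean(liwc$Dic)

print(counts)
print(group_means)
print(overall_dic)
```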

Next, I ran four t-tests, each followed by an r effect size calculation, to see if there were differences in the writing styles of Android and iPhone users.
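For readers who want to reproduce this step, here is a minimal sketch for a single variable using made-up scores. It assumes R's default Welch two-sample `t.test` and the common conversion r = sqrt(t² / (t² + df)); the post does not confirm that exactly this formula was used, and the helper `effect_size_r` is hypothetical.

```r
# Made-up Authentic scores for illustration only.
iphone_scores  <- c(60, 72, 55, 81, 66, 70)
android_scores <- c(48, 59, 52, 63, 45, 57)

# Hypothetical helper: convert a t-test result into an r effect size
# using r = sqrt(t^2 / (t^2 + df)). This conversion is an assumption.
effect_size_r <- function(tt) {
  t_val <- unname(tt$statistic)  # t statistic
  df    <- unname(tt$parameter)  # degrees of freedom (Welch-adjusted)
  sqrt(t_val^2 / (t_val^2 + df))
}

# x = iPhone, y = Android, matching the "mean of x" / "mean of y" output.
result <- t.test(iphone_scores, android_scores)
r      <- effect_size_r(result)

print(result)
print(r)
```

The same two calls would be repeated for each of the four summary variables.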

These are the results:

In the output, “mean of x” refers to iPhone users and “mean of y” refers to Android users.

As this sample is just a snippet of the data available on Twitter, it might be worth re-running the experiment several times and conducting a meta-analysis afterwards. Using these measures, I found that Android users scored higher on analytical thinking and iPhone users scored higher on authenticity.