Social Media Monitoring

  • Prof. Dr. Urs E. Gattiker
    Premium Member   Group moderator
    The company name is only visible to registered members.
    A while back I wrote a blog post outlining some test results regarding various text analysis or text analytics packages.
    What was interesting was that most failed to deliver the goods, regardless of which benchmark we used:

    ===> http://commetrics.com/articles/fails-validity-test/ (it is simply amazing what rubbish they are trying to sell)

    Today I came across a blog post by Themos Kalafatis that defines text analytics as follows:

    "In order for a computer to 'understand' unstructured text, it should be 'taught' that the word 'Dollar' is a currency of a country that is called 'US' and also that US, United States, USA and U.S.A is the same concept. This means that hundreds of thousands of concepts and synonyms have to be specified so that a computer identifies them in unstructured text. This process is called Text Annotation."
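    The annotation Themos describes can be sketched in a few lines. This is a toy illustration only: the synonym table below is invented for the example, and a real annotation resource would contain hundreds of thousands of entries.

```python
# Toy illustration of text annotation: mapping surface forms to one concept.
# The synonym table is a hypothetical example, not a real annotation resource.
SYNONYMS = {
    "us": "US", "u.s.": "US", "usa": "US", "u.s.a": "US",
    "united states": "US",
    "dollar": "CURRENCY:USD", "usd": "CURRENCY:USD", "$": "CURRENCY:USD",
}

def annotate(text: str) -> list[str]:
    """Return the concepts found in free text, trying longer surface forms first."""
    found = []
    lowered = text.lower()
    for surface, concept in sorted(SYNONYMS.items(), key=lambda kv: -len(kv[0])):
        if surface in lowered and concept not in found:
            found.append(concept)
    return found

print(annotate("The U.S.A pays in Dollar"))
```

    Even this toy version hints at why the effort multiplies: each new domain, and each new language, needs its own table.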

    The above indicates that if you were to develop a tool that can determine sentiment in text appropriately, you would first need to do a tremendous amount of text annotation, multiplied by the number of languages you want to do analytics for (e.g., English, Spanish, Chinese). Most work so far has focused on English, and it looks like we are still trying to get that right.

    Themos also points out that:

    "... since unstructured text becomes more available there will be a greater need for 'annotation farms' : Groups of people who will be manually annotating free text, identifying an ever-growing number of Companies, Managers, Politician names, or anything else that has to be 'taught' to a computer. Note that Annotation Farms already exist but the need for this service will become greater."

    The above, of course, suggests that this will be a costly exercise. Moreover, outsourcing might not work very well because, as we all know, English is not English (e.g., the Eurospeak used in Brussels differs a bit from the Queen's English). Accordingly, whether an employee in a faraway place is able to understand these cultural nuances in the language properly must be questioned.
    Local dialects, slang and abbreviations (e.g., think tweets) make it difficult to determine whether a sentiment is positive, negative or indifferent without human labor being involved in one way or another.

    But if outsourcing is not an option, be ready to pay a lot to get decent results.

    The future and trends of Text Analytics ===> MUST READ ===> http://kl.am/cNrT (short and sweet)

    Hence, quality comes at a price: either pay or walk away. So we still have to wait for sentiment analysis that does the trick with a certain degree of validity (i.e., it measures what it is supposed to measure). And even when it arrives, be ready to pay. In the meantime, don't hold your breath; it will take another few years before a tool arrives that meets your needs.

    Questions for all of us:
    1. Have you used text analysis tools? What is your opinion?
    2. How do you address sentiment analysis in your work? Are your clients satisfied with the way data is collected and presented/analyzed?

    Please share. Thank you for your help.
  • Post visible to registered members
  • Prof. Dr. Urs E. Gattiker
    Premium Member   Group moderator
    Sebastian

    Thanks so much for taking time out of your busy schedule to answer my post.

    For starters, your reply is a perfect example of how difficult it is to convey a message in plain English. I surely failed with this one, and for that I apologise.
    Agreed, the dollar example I used from Themos Kalafatis (quoted, in fact) was probably a rather easy one, used by him to make a point.
    But as a teacher I liked his example, because it illustrated his point for the non-initiated. Little good it did me, since you took me to task on it :-)

    So let me re-state and say: I was definitely not trying to suggest that we should not use sentiment analysis or text analytics. It is simply that I came across a blog post summarising some issues we should be aware of, and it caught my interest.

    The blog post reminded me that we have been trying to work out how to mechanise, if not automate, text analysis for about the last 30 years. In fact, ever since the PC started to come our way around 1978 with the Apple II, text analysis has been of interest to an ever greater group of people. And we are still trying to get a handle on it.

    For instance, you state:

    "... Why not accept that for the moment sentiment analysis has some uncertainty (in about 10-15% of the cases the result is misleading -- what are the results for humans anyway) and learn to deal with it."

    Well, I am not sure where you got your numbers from.
    In fact, if you had a chance to read my post on the issue, validity is below 50%, depending on what text you try to address (vocabulary used, context, cultural nuances, etc.) and what tool you try to apply to the challenge.

    In turn, my clients are not willing to accept this margin of error (as they tend to call it), and from a scientific point of view it is not that satisfactory either.

    Finally, my point is that, in my experience, software vendors might try to sell me snake oil, so I prefer to do a manual check to make sure that the information (e.g., charts, tables) shown to me does represent the data collected.
    Hence, having two humans cross-check some of the findings we get from a software package always reveals some interesting discrepancies.
    And yes, more often than not, two humans come to a vastly richer and different interpretation of Facebook posts or tweets on Twitter, and of how these affect marketing buzz or branding, than the software can ever reveal.

    Nevertheless, I agree with you that we should go on and try to solve this problem. But I hope you agree with me as well that we still have quite a journey in front of us before we reach our final destination.

    Thanks for sharing.
  • Post visible to registered members
  • Prof. Dr. Urs E. Gattiker
    Premium Member   Group moderator
    Sebastian

    Thanks again for responding. I quote:

    "... Why not accept that for the moment sentiment analysis has some uncertainty (in about 10-15% of the cases the result is misleading -- what are the results for humans anyway) and learn to deal with it.
    Well I am not sure where you got your numbers from."

    Then you respond, and I quote:
    "These numbers are quoted from sentiment detection research reports (a simple Google search should reveal a large portion of them). Most of the numbers were obtained in formal experimental settings with a sufficient number of test cases and a reproducible set-up. Simply the way experiments should be conducted ..."

    Sebastian, why not give our members some specific references, including URLs, to those reports that you apparently have access to?
    Please provide me with links to the papers that specifically demonstrate:

    - that in 10-15% of the cases the result is misleading, AND
    - that the above numbers were obtained in experimental settings with a sufficient number of test cases (as you state)

    I would appreciate it, because I cannot find them despite doing a careful search as you suggested.
    I am sure that with your gracious help we can provide these resources to our members. I can guarantee you that I and several other members will read this material very carefully.
    Thanks so much for your support.

    Have a great Pentecost holiday.
  • User photo
    Themos Kalafatis
    Dear All,

    I would like to give my two cents on this thread. I have been using Text Mining and Information Extraction for the past 4 years, and this is what I have learned so far:

    Sentiment Analysis is a hard problem

    This means that you can easily achieve 55% precision with some work. If you want to achieve 70%, then you must try hard. Anything above 80% is achieved only with really hard work, and only IF the problem that you wish to solve is an easy one.

    The more focused you are on the problem you are trying to solve, the better precision you get. This has happened to me over and over again. So, for example, if one wishes to extract the sentiment for a telecom company on Twitter, then he/she should break the problem down into specific chunks such as:

    - Detect sentences about the signal
    - Detect sentences about the charges (incorrectly charging more)
    - Detect sentences about billing plans
    - etc.

    and then work first on detecting the sentences above (and their boundaries). I have successfully done that using machine learning, with 83%-85% precision (text classification).
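    That decomposition step can be sketched roughly as follows. This is a hypothetical illustration only: simple keyword matching stands in for the trained text classifier described above, and the keyword lists are invented for the example.

```python
# Sketch: route tweets about a telecom brand into topic-specific buckets
# before any sentiment scoring is attempted. A real system would use a
# trained text classifier; keyword matching stands in for it here, and
# the keyword lists are hypothetical.
TOPIC_KEYWORDS = {
    "signal": ["signal", "coverage", "reception", "dropped call"],
    "charges": ["charged", "overcharged", "charge", "fee"],
    "billing_plans": ["plan", "tariff", "contract", "bundle"],
}

def route(tweet: str) -> list[str]:
    """Return the topic buckets a tweet belongs to (possibly several)."""
    text = tweet.lower()
    return [topic for topic, words in TOPIC_KEYWORDS.items()
            if any(w in text for w in words)]

tweets = [
    "No signal again on the train, terrible coverage",
    "They overcharged me on this month's bill",
    "Thinking of switching to a cheaper plan",
]
for t in tweets:
    print(route(t), "<-", t)
```

    Once each tweet lands in a narrow bucket, the sentiment question becomes far more tractable than "what is the sentiment of this tweet?" in the abstract.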

    However, the next step (extracting meaning from each sentence) is a hard one (usually around 70-73%).


    However, using many techniques together has proven itself over and over again: cluster analysis, keyword co-occurrences and sentiment analysis can look at the same problem from different points of view and provide the customer with real insights.
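    Of the techniques listed, keyword co-occurrence is the simplest to sketch. Here is a minimal pure-Python version, with a made-up three-document corpus for illustration:

```python
# Count how often pairs of distinct words appear in the same document.
# The three "documents" below are invented for illustration.
from collections import Counter
from itertools import combinations

def cooccurrences(docs: list[str]) -> Counter:
    """Count each unordered pair of distinct words sharing a document."""
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        counts.update(combinations(words, 2))
    return counts

docs = [
    "signal dropped again",
    "signal dropped on the train",
    "billing error again",
]
top = cooccurrences(docs).most_common(1)
print(top)  # the pair of words that most often share a document
```

    Pairs that keep turning up together (here, words about dropped signal) point to themes the analyst should inspect, complementing whatever the sentiment scores say.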

    Hope I helped somehow. Here are some links if you wish to look further into some of the things I discussed:


    http://lifeanalytics.blogspot.com/2009/10/sentiment-on-us-ec...

    http://lifeanalytics.blogspot.com/2009/06/how-habitat-uk-sho...

    http://lifeanalytics.blogspot.com/2008/12/when-telecom-custo...

    http://lifeanalytics.blogspot.com/2009/05/twitter-analytics-...



    Best,


    Themos
  • Post visible to registered members
  • Prof. Dr. Urs E. Gattiker
    Premium Member   Group moderator
    Sebastian

    Thanks for the help regarding sentiment analysis or text analytics and research papers.
    Of course, since we always provide the link to the material to save everybody time, I thought I would add the URLs to your wonderful suggestions as well.
    Hence, I went out and searched, and this is what I found:

    A survey on sentiment detection of reviews
    Huifeng Tang, Songbo Tan and Xueqi Cheng, Expert Systems with Applications,
    Volume 36, Issue 7, September 2009, Pages 10760-10773

    =====>>>> http://portal.acm.org/citation.cfm?id=1539436 (ACM membership login needed)

    Delta TFIDF: An Improved Feature Space for Sentiment Analysis
    Justin Martineau and Tim Finin, Proceedings of the Third AAAI International Conference on Weblogs and Social Media, May 2009, AAAI Press, pp. 490-497

    ====>>>> http://www.mendeley.com/research/delta-tfidf-an-improved-fea... (free login needed)

    Thumbs up? Sentiment Classification using Machine Learning Techniques
    Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan.
    Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 79--86, 2002.

    ====>>> http://www.cs.cornell.edu/home/llee/papers/sentiment.home.ht... (just download)

    I love the link to the data sets you provided, so I repeat it here:

    ====> http://www.cs.cornell.edu/people/pabo/movie-review-data/
    This post was modified on 27 May 2010 at 03:03 pm.
  • Post visible to registered members
  • Ruud Janssen
    Group moderator
    Dear Academics,
    Interesting reading but you lost me...

    I'm a practical, hands-on marketeer with the need to measure the performance of social media campaigns for live & hybrid events.

    At a recent conference (Masterclass Social Media in Amsterdam) I had the opportunity of using Social Mention as a way to gauge the actual sentiment around remarks on a topic or event. So far I'm just experimenting with it, but I was wondering what your remarks are regarding the use of such platforms.

    Here is the event: http://www.masterclasssocialmedia.nl/

    Here is the social mention string and measurement: http://socialmention.com/search/?t=all&q=%23msm10nl&...

    I did notice the scores vary widely over time. On site at the event the scores seemed much higher, and they are difficult to track.

    I also wanted to check in and request an update on new media metrics dashboards. I've been using Unilyzer (http://www.unilyzer.com), powered by Google Analytics. Do you have any recommendations or experiences you would like to share on these two topics?
 