Emotional metadata analysis service

Technological Specification

Which technologies matter in entertainment analysis

The most important thing in entertainment analysis is to go beyond general statistical analysis and address each customer's subtle preferences and goals. In many cases the differences are small, so the analysis has to take into account information that might otherwise have been treated as noise. The recent trend in big data analysis is toward architectures focused on real-time processing, driven by hardware improvements. Sockets has introduced a high-performance GPU computing approach to achieve real-time analysis. What we consider most important, however, is producing analysis that matches human intuition while still running in real time. We have therefore returned to our starting point: making sense of vague data, analyzing natural language through collaboration between humans and machines, and implementing a system architecture that makes use of both.


Combining data with different characteristics: Cleansing / Matching technology

Big data analysis has four types of input source: structured data, unstructured data, real-time data, and vague data. Sockets has a 10-year track record of structuring data in these different forms into a machine-readable state, particularly in matching and aggregating names.

  • Data at Rest

    Structured data

    Basic information, emotional metadata, genre, archive data...
    Data processing in terabytes

  • Data in Many Forms

    Unstructured data

    Reviews, SNS, blogs, lyrics/lines, books...
    Natural language analysis, knowledge-based scraping...

  • Data in Motion

    Real-time data

    Streaming data, user action...
    Distributed/GPU computing

  • Data in Doubt

    Vague data

    Vague, similar, incomplete, inconsistent, time-lagged...
    Matching Technology
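
As an illustration of name matching and aggregation for vague data, here is a minimal sketch in Python. It is not Sockets' actual matching technology: the function names, the sample master table, and the 0.85 similarity threshold are all assumptions made for this example, and the similarity measure is simply the standard library's `difflib.SequenceMatcher`.

```python
import unicodedata
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Fold width, case, and punctuation differences into one comparable form."""
    # NFKC maps full-width characters (common in Japanese sources) to half-width.
    text = unicodedata.normalize("NFKC", name).casefold()
    # Keep only letters and digits so "AC/DC" and "ac dc" compare equal.
    return "".join(ch for ch in text if ch.isalnum())


def match_name(raw: str, master: Dict[str, str], threshold: float = 0.85) -> Optional[str]:
    """Return the master ID whose name is most similar to `raw`, or None."""
    target = normalize(raw)
    best_id, best_score = None, 0.0
    for master_id, master_name in master.items():
        score = SequenceMatcher(None, target, normalize(master_name)).ratio()
        if score > best_score:
            best_id, best_score = master_id, score
    # Below the (assumed) threshold the name is left unmatched for human review.
    return best_id if best_score >= threshold else None


def aggregate(raw_names: List[str], master: Dict[str, str]) -> Dict[Optional[str], List[str]]:
    """Group raw name variants under the master ID they match (None = unmatched)."""
    groups: Dict[Optional[str], List[str]] = {}
    for raw in raw_names:
        groups.setdefault(match_name(raw, master), []).append(raw)
    return groups


if __name__ == "__main__":
    # Hypothetical master table: ID -> canonical name.
    master = {"A001": "The Beatles", "A002": "AC/DC"}
    print(aggregate(["ＡＣ／ＤＣ", "the  beatles", "Unknown Artist"], master))
```

Names that fall below the threshold are grouped under `None` rather than forced onto the nearest candidate, which leaves room for the human review step that the text describes as part of human and machine collaboration.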

Architecture overview: Technology and role

The architecture stack of the Sockets analysis system is divided into two categories of big data analysis, as shown in the image below. The two roles are distributed computing and data structuring. Defining these roles clearly lets data analysts, and the infrastructure engineers who organize big data, work efficiently. The important thing in data analysis is to increase the opportunities for trial and error.


Knowledge graph search technology

The Sockets database holds both fully structured data and unstructured data such as user action logs and web crawling data. All of this data is linked through a common key master, so cross-category, cross-media, and cross-experience connections are realized as ID-based hard links from music, video, and music videos to live event information, Twitter buzz keywords, and video titles. This also lets us produce well-reasoned analysis results that are not derived from statistical data alone.
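
The idea of ID-based hard links through a common key master can be sketched as a small in-memory graph. This is only an illustration under assumptions: the `KnowledgeGraph` class, the entity kinds, and the sample IDs are hypothetical and do not describe the real Sockets schema.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set


class KnowledgeGraph:
    """Entities keyed by a shared master ID, connected by undirected hard links."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, str]] = {}
        self.edges: Dict[str, Set[str]] = defaultdict(set)

    def add_entity(self, entity_id: str, kind: str, title: str) -> None:
        self.nodes[entity_id] = {"kind": kind, "title": title}

    def link(self, a: str, b: str) -> None:
        # Both ends must already exist in the key master.
        for entity_id in (a, b):
            if entity_id not in self.nodes:
                raise KeyError(f"unknown entity ID: {entity_id}")
        self.edges[a].add(b)
        self.edges[b].add(a)

    def related(self, start: str, kind: str, max_hops: int = 2) -> List[str]:
        """Breadth-first search for entities of `kind` within `max_hops` links."""
        seen = {start}
        queue = deque([(start, 0)])
        found: List[str] = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue
            # Sorted for a deterministic visiting order.
            for neighbor in sorted(self.edges.get(node, ())):
                if neighbor in seen:
                    continue
                seen.add(neighbor)
                if self.nodes[neighbor]["kind"] == kind:
                    found.append(neighbor)
                queue.append((neighbor, depth + 1))
        return found


def build_sample_graph() -> KnowledgeGraph:
    """Hypothetical sample data: one artist linked across media."""
    graph = KnowledgeGraph()
    graph.add_entity("AR1", "artist", "Sample Artist")
    graph.add_entity("TR1", "music", "Sample Song")
    graph.add_entity("MV1", "music_video", "Sample Song (Official Video)")
    graph.add_entity("EV1", "live_event", "Sample Artist Live Tour")
    graph.add_entity("KW1", "buzz_keyword", "#sampletour")
    graph.link("TR1", "AR1")
    graph.link("MV1", "TR1")
    graph.link("EV1", "AR1")
    graph.link("KW1", "EV1")
    return graph


if __name__ == "__main__":
    graph = build_sample_graph()
    # From a song, find live events reachable through the shared artist ID.
    print(graph.related("TR1", "live_event"))
```

Because every entity carries an ID from the same master, a query can cross from a song to a live event or a buzz keyword by following links, without relying on statistical co-occurrence.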

Crawling technology

Sockets has its own original crawling technology. Hundreds of web scraping plug-ins, developed over 10 years of experience, make it possible to extract data from web pages of various structures without tuning the crawler. A scheduler updates only the differences, and independent bots keep the system running 24 hours a day. In data cleansing, an originally designed whitelist and algorithm allow data to be organized efficiently and at low cost, at the same time as crawling.
Logic embedded to judge the characteristics of each web site makes it possible to decide accuracy and priority automatically.
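
The combination of per-site plug-ins, a whitelist, and difference-only updates can be sketched as follows. This is a simplified illustration, not the actual Sockets crawler: the plug-in registry, the `DifferentialCrawler` class, and the host name `reviews.example.com` are assumptions, and the example works on HTML strings passed in by the caller instead of fetching pages over the network.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import urlparse

# Registry of scraping plug-ins: hostname -> extraction function.
PLUGINS: Dict[str, Callable[[str], Dict[str, str]]] = {}


def plugin(hostname: str):
    """Decorator that registers an extraction function for one site."""
    def register(func):
        PLUGINS[hostname] = func
        return func
    return register


@plugin("reviews.example.com")  # hypothetical site
def extract_review(html: str) -> Dict[str, str]:
    """Toy extractor: take the text between <title> and </title>."""
    start = html.find("<title>")
    end = html.find("</title>")
    if start == -1 or end == -1:
        return {}
    return {"title": html[start + len("<title>"):end].strip()}


class DifferentialCrawler:
    """Processes a page only if its host is whitelisted and its content changed."""

    def __init__(self, whitelist: Iterable[str]) -> None:
        self.whitelist = set(whitelist)
        self.seen_hashes: Dict[str, str] = {}  # URL -> last content digest

    def process(self, url: str, html: str) -> Optional[Dict[str, str]]:
        host = urlparse(url).hostname
        if host not in self.whitelist or host not in PLUGINS:
            return None  # not a trusted source, or no plug-in for this site
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if self.seen_hashes.get(url) == digest:
            return None  # unchanged since the last run: nothing to update
        self.seen_hashes[url] = digest
        return PLUGINS[host](html)


if __name__ == "__main__":
    crawler = DifferentialCrawler(["reviews.example.com"])
    page = "<html><title> Great album </title></html>"
    print(crawler.process("https://reviews.example.com/1", page))  # extracted
    print(crawler.process("https://reviews.example.com/1", page))  # unchanged
```

Keeping site-specific extraction in plug-ins means that supporting a new page structure only requires registering another function, while the crawl loop and the change detection stay the same.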
