The topics in Quora allow the community to provide good-quality responses to questions as well as to organize them properly. To obtain question similarity between two topics, we first generate the tf-idf vectors from the questions of both topics. We then compute the similarity between the two topics as the cosine similarity between their tf-idf vectors. Topic name in question text: if two topics describe the same concept, then there is a high probability that the name of one topic will be present in the question text of the other topic. We conduct a human judgment experiment to evaluate how our model performs compared to the case where human subjects (regular Quora users) are tasked with predicting the merges. On the other hand, human judgment wrongly predicted that these two topics should merge. Next, we compare the accuracy of our model against human judgment based on the familiarity of the topics.
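The tf-idf/cosine step above can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: it assumes each topic's questions are concatenated into one document, raw term counts for tf, and a smoothed idf of 1 + log(N/df); the paper does not specify its weighting scheme or tokenizer.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one sparse tf-idf vector per document (one document per topic).
    tf = raw term count; idf = 1 + log(N / df), a smoothed variant so that
    terms shared by all documents still carry nonzero weight."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter()
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: c * (1.0 + math.log(n / df[t])) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts term -> weight)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

For a real corpus one would typically use a library vectorizer (e.g., scikit-learn's `TfidfVectorizer`) rather than this hand-rolled variant, but the principle is the same: topic pairs whose questions share discriminative vocabulary score high.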
We obtained 822,040 unique questions across 80,253 different topics, with a total of 1,833,125 answers to these questions. Note that most of the measures we define are simple, intuitive, and can be easily obtained automatically from the data (without manual intervention). We observed in the previous section that these linguistic as well as psycholinguistic aspects of the question asker are discriminatory factors. The content of a question text is important to attract people. Another direction of research focuses on the quality of user-generated content in Q&A sites, including the quality of questions (?; ?) and the quality of answers (?; ?; ?). In this paper, we studied the phenomenon of competing conventions in Quora topics and proposed a model to predict whether two given topics should merge, as well as the direction of the topic merge. However, these two topics were indeed merged later, on 19 Jan 2017. Another such interesting example that we found is ‘Psycho-2’ being merged into ‘Psychology-of-Everyday-Life’ on 22 Feb 2017. Automatic enumeration of such correct ‘false positives’ would be an immediate future step. Just as two topics can be merged to form a single topic, a merged topic can also be ‘unmerged’ back into the original topics.
These quantities encoding such linguistic activities can be easily measured for each question post. In Table 2, we show a collection of examples of open questions to illustrate that many of the above quantities based on the linguistic activities described in this section naturally correspond to the factors that human judges consider responsible for a question remaining unanswered. We then obtain the set of questions that were posted before the merge for both the source and the destination topic. Figure 2a shows an example where a user merges the topic ‘Trade Union’ (source topic) into ‘Labor Unions’ (destination topic). Figure 5c shows the distribution of the number of questions in ‘Topic A’ and ‘Topic B’ before they were merged. Figure 5a shows the distribution of the number of characters in ‘Topic A’ and ‘Topic B’. Let us assume that ‘Topic A’ gets absorbed into ‘Topic B’ to form the new ‘Topic B’, i.e., the winning convention. Initially, we remove ‘trivial’ merges, i.e., those merges that are minor lexical variants (e.g., plural forms).
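Filtering out trivial merges can be done with a simple name-normalization heuristic. The sketch below is an assumption about what such a filter might look like (case folding, hyphen/space variants, and crude plural stripping); the paper does not spell out its exact normalization rules.

```python
def is_trivial_merge(source, dest):
    """Heuristic check for 'trivial' merges between two topic names:
    returns True when the names differ only by case, hyphen/space
    variation, or a simple plural form (hypothetical rules)."""
    def norm(name):
        s = name.lower().replace('-', ' ').strip()
        words = []
        for w in s.split():
            # crude plural stripping: "parties" -> "party", "unions" -> "union"
            if w.endswith('ies') and len(w) > 3:
                w = w[:-3] + 'y'
            elif w.endswith('s') and not w.endswith('ss'):
                w = w[:-1]
            words.append(w)
        return ' '.join(words)
    return norm(source) == norm(dest)
```

Under this heuristic, ‘Labor-Unions’ → ‘Labor Union’ would be discarded as trivial, while ‘Trade-Union’ → ‘Labor-Unions’ would be kept as a genuine competing-convention merge.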
A question asker’s linguistic, emotional, and cognitive states are also revealed through the language he/she uses in the question text. Our central finding is that the way users use language while writing the question text can be a very effective means to characterize answerability. To find out whether there are question text overlaps, we extract 1-, 2-, 3- and 4-grams from the questions belonging to the topic pairs. These are then passed to the supervised classifier for prediction. We perform a 10-fold cross-validation with an SVM classifier. Further, we calculate the overlap coefficient of the co-occurring topics of each pair. Figure 3c shows the distribution of the unweighted overlap coefficients of the co-occurring topics. We first find the five most frequent co-occurring topics of each pair. We set a threshold T for the similarity, above which we call a topic pair a merge and below which we call it a non-merge. Predicting the direction of topic merge: we use the above discriminating factors in a standard classification framework to predict the direction of topic merge. In the following, we discuss the factors instrumental in influencing the direction of topic merge. Our system could also act as a filter that provides the scarce topical experts with possible topic pairs, thus helping ease the cognitive burden on these experts.
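The n-gram extraction and the (unweighted) overlap coefficient used above can be sketched as follows. This is a minimal illustration under simple assumptions (whitespace tokenization, n-grams as sets); the overlap coefficient is the standard |A ∩ B| / min(|A|, |B|).

```python
def ngrams(text, n):
    """Return the set of word n-grams of a question text
    (assumes simple whitespace tokenization and lowercasing)."""
    toks = text.lower().split()
    return {' '.join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def overlap_coefficient(a, b):
    """Unweighted overlap coefficient between two sets:
    |A intersection B| / min(|A|, |B|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

In the pipeline described above, one would apply `ngrams` with n = 1..4 to the questions of each topic in a pair, and `overlap_coefficient` to the sets of (here, five most frequent) co-occurring topics; the resulting scores then feed the SVM classifier as features.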