Comparative Analysis of Measures of Similarity and Semantic Relatedness for Text Classification

dc.contributor.author: Chandna, Shirin
dc.contributor.supervisor: Batra, Shalini
dc.date.accessioned: 2010-08-30T12:45:43Z
dc.date.available: 2010-08-30T12:45:43Z
dc.date.issued: 2010-08-30T12:45:43Z
dc.description: M.E. [en]
dc.description.abstract: This thesis discusses techniques for text classification, including Latent Semantic Indexing (LSI) and measures of semantic relatedness and similarity. LSI assumes that textual data has an underlying semantic structure and that the relationship between terms and documents can be re-described in terms of that structure. Its key idea is to map documents onto a vector space of reduced dimensionality, called the latent semantic space, using Singular Value Decomposition (SVD). Semantic relatedness measures quantify the degree to which words or concepts are related, considering not only similarity but any possible semantic relationship between them. The thesis computes several measures of semantic relatedness (MSRs) that use WordNet as their knowledge source, as well as other MSRs, such as NGD and NCD, that use the Web as their knowledge base. These measures are tested by checking their correlation with human judgement. [en]
dc.description.sponsorship: CSED [en]
dc.format.extent: 4154506 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10266/1200
dc.language.iso: en [en]
dc.subject: Semantic Similarity [en]
dc.subject: Semantic Relatedness [en]
dc.subject: MSRs [en]
dc.subject: NSS [en]
dc.title: Comparative Analysis of Measures of Similarity and Semantic Relatedness for Text Classification [en]
dc.type: Thesis [en]
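
The LSI mapping described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the toy term-document matrix, the function names `lsi_project` and `cosine`, and the choice of k = 2 are all assumptions made for illustration, and NumPy is assumed to be available. The sketch factorizes the matrix with SVD, keeps the k largest singular values, and compares documents in the resulting latent semantic space.

```python
import numpy as np


def lsi_project(term_doc, k):
    """Map documents into a k-dimensional latent space via truncated SVD.

    term_doc has one row per term and one column per document.
    Returns (U_k, s_k, doc_vectors) with one row of doc_vectors per document.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T
    return U[:, :k], s[:k], doc_vectors


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy corpus (illustration only): rows are terms, columns are
# documents d0..d4; entries are raw term counts.
A = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 1, 0, 0],  # automobile
    [1, 1, 0, 0, 0],  # engine
    [0, 0, 0, 2, 1],  # flower
    [0, 0, 0, 1, 1],  # petal
], dtype=float)

_, singular_values, doc_vectors = lsi_project(A, k=2)

# d0 ("car engine") and d1 ("automobile engine") share only one term.
raw_similarity = cosine(A[:, 0], A[:, 1])
latent_similarity = cosine(doc_vectors[0], doc_vectors[1])
cross_topic_similarity = cosine(doc_vectors[0], doc_vectors[3])

print("raw d0/d1:", round(raw_similarity, 3))
print("latent d0/d1:", round(latent_similarity, 3))
print("latent d0/d3:", round(cross_topic_similarity, 3))
```

In this toy corpus the two vehicle documents overlap on a single term in the original space, yet they land on the same direction in the two-dimensional latent space, while the flower documents stay orthogonal to them. Real collections would normally use weighted counts (for example TF-IDF) and a much larger k; both are left out here to keep the sketch short.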

Files

Original bundle

Name: 1200.pdf
Size: 3.96 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission