Detecting Domain From Source Code Using Semantic Clustering

dc.contributor.author: Madan, Sanjay
dc.contributor.supervisor: Batra, Shalini
dc.date.accessioned: 2009-08-12T08:20:35Z
dc.date.available: 2009-08-12T08:20:35Z
dc.date.issued: 2009-08-12T08:20:35Z
dc.description.abstract [en]: Many approaches have been developed for understanding software source code. Most of them focus on a program's structural information, and in doing so they lose the crucial domain semantics carried by the text and symbols of the source code. To understand software as a whole, these approaches need to be enriched with conceptual insights drawn from the domain semantics. This thesis proposes mapping the domain to the code by applying information retrieval techniques to the linguistic information in source code, such as identifier names and comments. We introduce an algorithm based on Semantic Clustering that groups source artifacts by their related vocabulary, taking synonymy and polysemy into account. The algorithm uses Latent Semantic Indexing (LSI). The main advantage of the approach is that it works at the textual level of the source code, which makes it language independent. It correlates the semantics with structural information and applies at different levels of abstraction (e.g. packages, classes, methods). Once the clusters have been detected, the program code is labeled automatically on the basis of semantic similarity and then explored visually. Because three-dimensional visualization makes concept detection much easier, we concentrate on visualizing the semantic clusters detected in the source code.
dc.format.extent: 8473301 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10266/867
dc.language.iso [en]: en
dc.subject [en]: semantic clustering
dc.subject [en]: LSI
dc.subject [en]: semantics
dc.subject [en]: clustering
dc.title [en]: Detecting Domain From Source Code Using Semantic Clustering
dc.type [en]: Thesis
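
The pipeline outlined in the abstract (extract identifier and comment terms, project them with LSI, cluster by semantic similarity, label each cluster) can be illustrated with a minimal sketch. This is not the thesis's implementation. The function names, the stopword list, the rank of 2, the similarity threshold of 0.7, the frequency-based labeling, and the four sample snippets are all assumptions chosen for illustration, and NumPy's SVD stands in for whatever LSI tooling the author used.

```python
import re
from collections import Counter

import numpy as np

# Hypothetical stopword list: language keywords and filler words carry no domain meaning.
STOPWORDS = {"def", "to", "from", "on"}

# Hypothetical sample artifacts: two from a banking domain, two from a drawing domain.
SNIPPETS = [
    "def deposit_money(account, amount):  # add amount to account balance",
    "def withdrawMoney(account, amount):  # subtract amount from account balance",
    "def draw_circle(canvas, radius):  # render circle on canvas",
    "def drawSquare(canvas, side):  # render square on canvas",
]


def tokenize(source: str) -> list[str]:
    """Split identifiers and comments into lowercase terms (snake_case and camelCase aware)."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", source):
        # Break camelCase into parts, e.g. "withdrawMoney" -> "withdraw", "Money".
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word)
        terms.extend(p.lower() for p in parts)
    return [t for t in terms if t not in STOPWORDS]


def lsi_vectors(documents: list[str], rank: int = 2) -> np.ndarray:
    """Return one unit-length LSI vector per document (rows), using a rank-k truncated SVD."""
    counts = [Counter(tokenize(d)) for d in documents]
    vocab = sorted(set().union(*counts))
    # Term-document matrix: one row per term, one column per document.
    matrix = np.array([[c[t] for c in counts] for t in vocab], dtype=float)
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(rank, len(s))
    doc_vecs = (np.diag(s[:k]) @ vt[:k]).T
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    return doc_vecs / norms


def cluster(doc_vecs: np.ndarray, threshold: float = 0.7) -> list[int]:
    """Group documents whose cosine similarity reaches the threshold (connected components)."""
    sim = doc_vecs @ doc_vecs.T
    n = len(doc_vecs)
    labels = [-1] * n
    current = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = current
        stack = [i]
        while stack:
            j = stack.pop()
            for m in range(n):
                if labels[m] == -1 and sim[j, m] >= threshold:
                    labels[m] = current
                    stack.append(m)
        current += 1
    return labels


def label_cluster(documents: list[str], labels: list[int], cluster_id: int, top_n: int = 2) -> list[str]:
    """Label a cluster with its most frequent terms (a simple stand-in for automatic labeling)."""
    totals = Counter()
    for doc, lab in zip(documents, labels):
        if lab == cluster_id:
            totals.update(tokenize(doc))
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    return sorted(totals, key=lambda t: (-totals[t], t))[:top_n]


if __name__ == "__main__":
    assignments = cluster(lsi_vectors(SNIPPETS))
    for cid in sorted(set(assignments)):
        print(cid, label_cluster(SNIPPETS, assignments, cid))
```

Because the sketch only reads identifiers and comments as text, the same code would apply to snippets in any programming language, which mirrors the language-independence claim in the abstract. On the sample input, the two account-related snippets fall into one cluster and the two canvas-related snippets into another.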

Files

Original bundle

Name: 867 Sanjay Madan (80732016).pdf
Size: 2.87 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission