Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/4112
Full metadata record
DC Field | Value | Language
dc.contributor.author | Dakshinamoorthy, Meenakshi | -
dc.contributor.author | Abdul Rahim, Mohamed Shanavas | -
dc.date.accessioned | 2024-05-30T17:51:22Z | -
dc.date.available | 2024-05-30T17:51:22Z | -
dc.date.issued | 2024-05-30 | -
dc.identifier.issn | 2302-9285 | -
dc.identifier.uri | http://localhost:8080/xmlui/handle/123456789/4112 | -
dc.description.abstract | The availability of large data storage systems has driven the digitization of information. Question-and-answer communities such as Quora and Stack Overflow take advantage of such systems to provide information to users. However, as the volume of stored information grows, it becomes difficult to keep track of what already exists, especially duplicated information. This work presents a similarity detection technique that identifies similarity levels in textual data based on the context in which the information was provided: transformer-based contextual similarity detection (TCSD), which uses a combination of bidirectional encoder representations from transformers (BERT) and similarity metrics to derive features from the data. The derived features are used to train an ensemble model for similarity detection. Experiments were performed using the Quora question similarity dataset. Results and comparisons indicate that the proposed model detects similarity with an accuracy of 92.5%, demonstrating high efficiency. | en_US
dc.language.iso | en | en_US
dc.publisher | Bharathidasan University | en_US
dc.subject | Bagging; BERT; Contextual text analysis; Ensemble modelling; Similarity detection; Transformers | en_US
dc.title | Transformer induced enhanced feature engineering for contextual similarity detection in text | en_US
dc.type | Article | en_US
Appears in Collections:Department of Mathematics

Files in This Item:
File | Description | Size | Format
3284-9921-1-PB.pdf | | 431.43 kB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
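The abstract describes a pipeline that derives similarity features from question pairs (contextual embeddings plus similarity metrics) and feeds them to a bagging-style ensemble. A minimal, dependency-free sketch of that idea follows; it is illustrative only and not the authors' implementation. The `embed` function here is a hypothetical stand-in for a real BERT sentence embedding, and the threshold "models" stand in for trained base classifiers.

```python
import math

def jaccard(a, b):
    """Token-overlap similarity between two strings (one similarity metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

def embed(text):
    """Hypothetical placeholder for a BERT sentence embedding.

    A bag-of-letters count vector stands in so the sketch stays
    dependency-free; TCSD as described would use transformer outputs.
    """
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def features(q1, q2):
    """Derive per-pair features: embedding similarity + a lexical metric."""
    return [cosine(embed(q1), embed(q2)), jaccard(q1, q2)]

def bagged_predict(models, feats):
    """Majority vote over an ensemble of base classifiers (bagging-style)."""
    votes = [m(feats) for m in models]
    return max(set(votes), key=votes.count)

# Toy "trained" base classifiers: simple thresholds on the derived features.
models = [
    lambda f: f[0] > 0.9,
    lambda f: f[1] > 0.5,
    lambda f: (f[0] + f[1]) / 2 > 0.7,
]

pair = ("How do I learn Python?", "How can I learn Python?")
print(bagged_predict(models, features(*pair)))  # prints True (duplicate)
```

In the actual TCSD setting, `embed` would be replaced by BERT sentence representations and the base classifiers would be trained on bootstrap samples of the Quora question-pair data; the structure of the pipeline (features → ensemble vote) is what the sketch shows.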