Feb. 13, 2024, 10:40 a.m.

Introduction to Similar Plus

Introduction:

We have extended the similarity analysis with additional features, including clipboard data comparison and a visual side-by-side view of the similarity result. Our proprietary PDF processor, written in C++, extracts the text content of PDF files locally.

This article introduces the Similarity Analysis Plus application, which is supported on Android, iOS, macOS and Windows. The basic application is introduced in a separate article: Introduction to similarity basic

Guided illustration:

The following figures illustrate the user interface design.

 

There are three navigation tabs in this application. The “Main” tab is for setting the source and compare files, the “Clipboard” tab is for clipboard input, and the “Analysis” tab is for side-by-side comparison. Two “Browser” buttons let you browse for (pick) a file on the local device (mobile or PC). The “Process similarity analysis” button compares the source file with the compare file. If either the source file or the compare file is empty, the button is disabled (greyed out). Once both files have been set, the button turns light orange.

The “Editor” tab page

Figure 1. The main page of Similarity Analysis Plus.

 

When both the source file and the compare file have been set, the “Process similarity analysis” button turns light orange. After pressing it, please wait patiently for the result; the internal timeout is 10 seconds.

The “process the similarity analysis” enabled

Figure 2. Ready to process the similarity analysis.

 

The similarity analysis result is explained as follows:

  1. The similarity unique count follows a first-come, first-served rule: the first match of a piece of content is counted as unique.
  2. The similarity overlap count measures content that matches repeatedly in more than one place. An overlap percentage above 100% means that some patterns (words or sentences) occur more than once.
  3. An overlap percentage above 100% together with a similarity percentage above 50% means that the content is most likely highly similar and that some of it is repeated.
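
The counting rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's actual algorithm: the function name `similarity_counts`, the idea of splitting both texts into units (words or sentences) beforehand, the use of exact matching, and the choice of the number of source units as the denominator for both percentages are all hypothetical.

```python
from collections import Counter

def similarity_counts(source_units, compare_units):
    """Return (unique_pct, overlap_pct) for two lists of text units.

    Hypothetical sketch: units are compared by exact equality and both
    percentages are relative to the number of source units.
    """
    if not source_units:
        return 0.0, 0.0

    compare_freq = Counter(compare_units)  # occurrences of each unit in the compare text
    seen = set()   # source units already counted as unique
    unique = 0
    overlap = 0

    for unit in source_units:
        hits = compare_freq.get(unit, 0)
        if hits == 0:
            continue            # no match anywhere in the compare text
        overlap += hits         # every occurrence counts toward overlap
        if unit not in seen:    # first come, first served: count once
            seen.add(unit)
            unique += 1

    total = len(source_units)
    return 100.0 * unique / total, 100.0 * overlap / total
```

For example, with source units `["a", "b"]` and compare units `["a", "a", "a"]`, the sketch reports a unique percentage of 50% and an overlap percentage of 150%, because the one matching unit occurs three times in the compare text.
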
Similarity analysis result

Figure 3. The similarity analysis result for the source file and the compare file.

On the “Analysis” tab page, the content is compared side by side: a left and right split screen shows the original content of each file.

  1. The “First” button jumps to the first similar content (at the beginning).
  2. The “Previous” button goes back to the previous similar content.
  3. The “Next” button goes to the next similar content.
  4. The “Last” button jumps to the last similar content (at the end of the file).
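
The behaviour of the four buttons can be pictured as a cursor moving over an ordered list of match positions. The sketch below is an assumption for illustration only: the class `MatchNavigator`, its method names, and the choice to stop at either end instead of wrapping around are hypothetical and not taken from the application.

```python
class MatchNavigator:
    """Cursor over the positions of similar content (hypothetical helper)."""

    def __init__(self, positions):
        self.positions = sorted(positions)  # match positions in document order
        self.index = 0

    def _current(self):
        # Return None when there is no similar content at all.
        if not self.positions:
            return None
        return self.positions[self.index]

    def first(self):
        self.index = 0
        return self._current()

    def previous(self):
        # Assumed behaviour: stay on the first match instead of wrapping.
        self.index = max(self.index - 1, 0)
        return self._current()

    def next(self):
        # Assumed behaviour: stay on the last match instead of wrapping.
        self.index = min(self.index + 1, max(len(self.positions) - 1, 0))
        return self._current()

    def last(self):
        self.index = max(len(self.positions) - 1, 0)
        return self._current()
```

With matches at positions 10, 25 and 40, calling `first()` returns 10, `next()` then returns 25, and `last()` returns 40; a further `next()` stays on 40 under the no-wrap assumption.
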
The “compare result side by side”

Figure 4. The similarity analysis comparison result of the source file and the compare file, side by side.

 

Content is shown at the same position on both sides if it matches or if one side contains the other's value. Where the content is the same (or similar), it has the same color on both sides. Where no similarity is found, the colors differ: blue grey on the left-hand side and pale grey on the right-hand side.

The “compare result side by side”

Figure 5. The similarity analysis content at the same position.

 

Save PDF to text file. The application can extract the content of a PDF file and save the extracted text to a text file.

The menu to select save pdf to text

Figure 6. The save PDF to text option in the similarity analysis menu.

 

Summary:

We provide quick similarity analysis of two input files (PDF or text) and also support clipboard input. The application can save PDF content to a text file and compare content visually side by side.

Support and contact:

Please email support@thinkwider.co with any inquiries or support requests. Thank you.

Download:

Preparing