Polymorphic Short Tandem Repeats Finder

About pSTR

pSTR Finder (pSTR) is a .NET application that can be installed as a stand-alone 64-bit Windows desktop application or deployed as a web application. At present, only the web version is available for non-commercial use.

pSTR is designed to efficiently analyse STR sequences from multiple genome sequence samples, generated with any suitable STR finder application. At present we support Tandem Repeats Finder (TRF; Gary Benson, 1999) for generating the STR sequences. Users are encouraged to familiarize themselves with TRF so that they can generate the STR sequences they want and use them as input. Although the pSTR user interface is intuitive for most users, it is also possible to download and run either the desktop or the command-line version of TRF with the desired options and then submit the results to pSTR. Please refer to the TRF help for more information.

System and Methods

Please note that each sample FASTA file must contain exactly one FASTA header line and one sequence. Users are expected to submit multiple sample FASTA files and then designate one of them as the reference sample. All required parameters must be entered before the request is submitted for processing. Refer to Tandem Repeats Finder (Gary Benson, 1999) for more information on using that application.
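The single-record requirement above can be checked locally before uploading. The following is a minimal sketch, not part of pSTR itself; the function name check_single_record_fasta is hypothetical, and it assumes the usual FASTA convention that a header line starts with ">".

```python
def check_single_record_fasta(text):
    """Return (header, sequence) if `text` holds exactly one FASTA record.

    Hypothetical helper for illustration only; pSTR's own validation
    rules may differ.
    """
    # Ignore blank lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("file must start with a FASTA header line ('>')")

    # Exactly one header line is allowed per sample file.
    headers = [ln for ln in lines if ln.startswith(">")]
    if len(headers) != 1:
        raise ValueError(f"expected exactly 1 header line, found {len(headers)}")

    # Everything after the header is treated as one continuous sequence.
    sequence = "".join(lines[1:])
    if not sequence:
        raise ValueError("no sequence found after the header line")
    return lines[0][1:], sequence


header, sequence = check_single_record_fasta(">chrX test\nACGT\nTTGA\n")
```

A file containing two or more ">" lines raises ValueError, which suggests splitting it into one file per sequence before submission.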

At the end of the process, multiple reports are generated and saved in CSV (comma-separated values) format:

  1. Summary Report: this report shows the number of identical STRs, the number of polymorphic STRs, and the number of different STRs between two matched samples. It also gives the total numbers of identical, polymorphic, and unique STR loci for all samples analyzed, relative to the specified reference sample.
  2. Detail Report: this report relates to the specified reference sample. The following are recorded in the CSV file: the 5' and 3' flanking sequences; the number of bases constituting the repeat unit; the sequence of the STR motif; the position of the STR, starting from the first base in the repeat; the variation in the number of repeat units across the included samples that share the same 5' & 3' flanking sequences; the number of repeats for the reference sample; and the number of repeats for every other sample included. Two types of STR sequence record are excluded from the Detail Report: records at the same putative STR locus that are recorded as having a shorter repeat motif, and records having the same 5' and 3' flanking sequences but a smaller repeat number. These excluded records are saved for each sample in the Duplicated STR Report.
  3. Duplicated STR Report: a separate report for each sample captures the "duplicated" STR records mentioned above.
  4. Identical STR Report: this report captures all "identical" STR sequence records from the comparison of two samples. An "identical" STR means that the 5' & 3' flanking sequences and the repeat number are all the same at the matching STR loci in the two input samples.
  5. Polymorphic STR Report: this report captures all STR sequence records having the same 5' & 3' flanking sequences but a different repeat number between two input samples.
  6. Different STR Report: this report, like 4 & 5 above, requires two input samples. It captures all STR sequence records that exist only in the source sample and not in the other (target) sample.
  7. Different STR Report after switching samples: this report also captures all "different" STR sequence records, but with the "source" and "target" samples switched and then re-matched.
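The pairwise categories in reports 4 to 7 can be illustrated with a small sketch. It is a simplified model under stated assumptions rather than pSTR's actual implementation: each sample is represented as a dictionary keyed by the (5' flank, 3' flank) pair with the repeat number as the value, "duplicated" records are assumed to have been removed already, and the name compare_samples and the example sequences are made up for illustration.

```python
def compare_samples(source, target):
    """Split the STR records of `source` into three groups relative to `target`.

    Both arguments map (flank5, flank3) -> repeat number. This layout is an
    assumption made for the sketch, not pSTR's internal data model.
    """
    identical, polymorphic, different = {}, {}, {}
    for flanks, repeats in source.items():
        if flanks not in target:
            # Flanking pair exists in the source sample only.
            different[flanks] = repeats
        elif target[flanks] == repeats:
            # Same flanks and same repeat number.
            identical[flanks] = repeats
        else:
            # Same flanks, different repeat number: (source, target).
            polymorphic[flanks] = (repeats, target[flanks])
    return identical, polymorphic, different


# Made-up records with 10-base flanking sequences.
sample_a = {
    ("ACGTACGTAC", "TTGACCATGA"): 12,
    ("GGCATTAGCA", "CCATAGGTCA"): 8,
    ("TTTACGGACA", "AGGTCATTAC"): 5,
}
sample_b = {
    ("ACGTACGTAC", "TTGACCATGA"): 12,
    ("GGCATTAGCA", "CCATAGGTCA"): 10,
    ("CATGGACTTA", "GACATTGACC"): 7,
}

identical, polymorphic, different = compare_samples(sample_a, sample_b)
# Report 7: the same comparison with source and target switched.
_, _, different_switched = compare_samples(sample_b, sample_a)
```

Running the comparison in both directions is what separates the two "different" reports: a record found only in sample_a appears in the first, and a record found only in sample_b appears in the second.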

Performance

The performance of pSTR was tested, with TRF as the STR finder, on four human chromosome X sequences (AC_000155_1, CM000685.1, NC_018934_2, and CM000274.1) obtained from GenBank. The respective FASTA source samples contain 143,733,266 bp, 155,270,560 bp, 155,181,468 bp, and 155,181,468 bp.

The following TRF options were used to search all four chromosome X samples:

  1. Alignment Score for match, mismatch, indel: 2, 7, 7
  2. Minimum Alignment Score: 50
  3. Maximum Repeat Unit Size: 6
  4. 5' and 3' Flanking Sequence Size: 10
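For users who prefer to run the command-line version of TRF themselves, the options above could be assembled roughly as sketched below. The positional order (file, match, mismatch, indel, PM, PI, minimum score, maximum period) follows TRF's documented command-line usage. Several things here are assumptions: the executable name "trf", the input file name, and the PM = 80 and PI = 10 values, which are commonly used TRF settings that this page does not specify. The flanking sequence size is not modelled, because the page does not say how it maps onto a TRF option.

```python
import shlex


def build_trf_command(fasta_path, match=2, mismatch=7, indel=7,
                      pm=80, pi=10, min_score=50, max_period=6):
    """Build a TRF argument list from the options listed above.

    pm and pi (match and indel probabilities) are assumed values;
    they are not given on this page.
    """
    args = [
        "trf",        # assumed executable name; it varies by platform and release
        fasta_path,
        match, mismatch, indel,
        pm, pi,
        min_score, max_period,
        "-d",         # write a data (.dat) file
        "-h",         # suppress HTML output
    ]
    return [str(a) for a in args]


# "chrX_sample.fasta" is a placeholder file name.
cmd = build_trf_command("chrX_sample.fasta")
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run(cmd) once the executable name and file path have been adjusted to the local installation.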

Click here to download the performance test sample data.
Click here to download the performance test reports generated by TRF.

Account Registration

Users are required to create a free account with us. Each user's e-mail address is verified during registration. All e-mail addresses are kept confidential and are used only to track the requests submitted by each user. We may also use an e-mail address to notify a user if a request cannot be completed for any reason. The application becomes accessible once the user completes registration and logs in to the site. Click here to create a new account.

pSTR Limitations

Depending on the size of the contig sequences submitted and the speed of the user's internet connection, the upload time may vary from a few seconds to hours. pSTR limits each user's overall upload to 2 GB, within a duration of 5 hours or less. We will monitor our server and adjust this limit as necessary.

Please note that the pSTR matching process, and the optional STR generation process using the selected STR finder tool, may consume considerable resources and time. We therefore limit each user to one request at a time: a user must wait for the current pSTR process to complete before submitting the next request. After a request is submitted, the Submit button changes to Check Status. Users may click the Check Status button to find out the status of the pSTR matching and report generation.

pSTR will compress the final reports and place them in a personalized folder for the user to download. We encourage all users to download their reports as soon as possible, since storage on our server is limited. We may periodically remove users' uploaded files and downloadable reports without notice. We thank you in advance for your understanding.

Browser Support

pSTR relies on HTML5 and CSS3, mainly for its multiple-file upload functionality. At this point Internet Explorer is not supported because it lacks multiple-file upload support. We will add support for IE as soon as a feasible solution becomes available.

Final Remark

As a final note, it is common knowledge that applications often need to be revised to become more stable and useful. We intend to make this application better and truly useful, so we appreciate being notified of any issues you encounter.

Contact Us

Please send e-mail to jimlee@ntu.edu.tw if you have any technical questions or comments, or if you encounter a bug. We'll do our best to get back to you in a timely manner. Thank you.