The Qualcomm Open Source Technology Group License Identifier tool scans source code and identifies both the license and the region of license text, using known license templates. The tool takes a bag-of-words approach. Rather than counting only single words (unigrams), it also treats each bigram and trigram as one “word”. It then computes the distribution of these unigrams, bigrams, and trigrams and uses that distribution to detect the license type. To locate the license text region, the tool uses an edit distance metric to find the optimal start and end positions of the identified license text.
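
The two steps described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function names (`ngram_counts`, `identify_license`, `find_region`), the use of cosine similarity to compare n-gram distributions, and the brute-force line-range search are all assumptions made for illustration.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase word tokens; comment markers and punctuation are dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_counts(text, max_n=3):
    """Count unigrams, bigrams, and trigrams, each treated as one 'word'."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    """Assumed similarity measure between two n-gram count distributions."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_license(source_text, templates):
    """Return (name, score) of the template closest to the source text."""
    source = ngram_counts(source_text)
    scores = {name: cosine_similarity(source, ngram_counts(body))
              for name, body in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def find_region(lines, template):
    """Brute-force search for the half-open line range [start, end) whose
    tokens have the smallest edit distance to the template."""
    target = tokenize(template)
    best = None
    for start in range(len(lines)):
        for end in range(start + 1, len(lines) + 1):
            candidate = tokenize(" ".join(lines[start:end]))
            dist = edit_distance(candidate, target)
            if best is None or dist < best[0]:
                best = (dist, start, end)
    return best[1], best[2]
```

Calling `identify_license(source_text, templates)` with a dictionary of template names and texts picks the closest template, and `find_region(source_text.splitlines(), templates[name])` then narrows the match to a line range. The exhaustive search is quadratic in the number of lines, so a practical tool would need a cheaper search strategy for large files.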