bg.bas.dcl.CorpusTools
Class CorpusMWETagger
java.lang.Object
bg.bas.dcl.CorpusTools.CorpusMWETagger
public class CorpusMWETagger
- extends java.lang.Object
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CorpusMWETagger
public CorpusMWETagger()
main
public static void main(java.lang.String[] args)
- Parameters:
args
readMWEDictFromFile
public static void readMWEDictFromFile(java.lang.String path)
getFileListing
public static java.util.List<java.io.File> getFileListing(java.io.File aStartingDir)
throws java.io.FileNotFoundException
- Recursively walks a directory tree and returns a List of all
Files found; the List is sorted using File.compareTo().
- Parameters:
aStartingDir
- a valid directory that can be read.
- Throws:
java.io.FileNotFoundException
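The behaviour described for getFileListing can be illustrated with a minimal sketch. This is not the CorpusMWETagger source: the class name FileListingSketch and the helpers collect and validateDirectory are hypothetical, and the sketch assumes that subdirectories are included in the returned List alongside regular files and that a non-existent starting directory is what triggers FileNotFoundException.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-alone class; not part of bg.bas.dcl.CorpusTools.
public class FileListingSketch {

    // Walks the tree rooted at aStartingDir and returns every entry found,
    // sorted with File.compareTo() (File implements Comparable<File>).
    public static List<File> getFileListing(File aStartingDir)
            throws FileNotFoundException {
        validateDirectory(aStartingDir);
        List<File> result = new ArrayList<>();
        collect(aStartingDir, result);
        Collections.sort(result);
        return result;
    }

    // Depth-first traversal. Assumption: directories are added to the
    // result as well as regular files.
    private static void collect(File dir, List<File> result) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // unreadable directory or I/O error: skip it
        }
        for (File entry : entries) {
            result.add(entry);
            if (entry.isDirectory()) {
                collect(entry, result);
            }
        }
    }

    // Assumption: only a missing directory raises FileNotFoundException;
    // other invalid arguments raise IllegalArgumentException.
    private static void validateDirectory(File dir)
            throws FileNotFoundException {
        if (dir == null) {
            throw new IllegalArgumentException("Directory should not be null.");
        }
        if (!dir.exists()) {
            throw new FileNotFoundException("Directory does not exist: " + dir);
        }
        if (!dir.isDirectory()) {
            throw new IllegalArgumentException("Is not a directory: " + dir);
        }
        if (!dir.canRead()) {
            throw new IllegalArgumentException("Directory cannot be read: " + dir);
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Lists the directory given as the first argument, or the current one.
        File start = new File(args.length > 0 ? args[0] : ".");
        for (File f : getFileListing(start)) {
            System.out.println(f);
        }
    }
}
```

Sorting once after the traversal, rather than at each directory level, keeps the whole List ordered by path, which matches the "sorted using File.compareTo()" wording above.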
indexTexts
public static void indexTexts(java.lang.String path)
correctChunks
public static void correctChunks(java.lang.String path)
extractMWECandidates
public static void extractMWECandidates(java.lang.String path)
extractMWECandidatesPlusFreq
public static void extractMWECandidatesPlusFreq(java.lang.String path)
tagFromDictionary
public static void tagFromDictionary(java.lang.String path)
markMWEsInFiles
public static void markMWEsInFiles(java.lang.String path)
findMWEs
public static void findMWEs(java.lang.String path)