
Using Stanford Sentiment Treebank #2

@anujkumar9

Description


Hello,

In the code, it looks like the files that need to be read are "vocab-cased.txt", "trainsents.txt", "testsents.txt", etc. (see below). However, when I download the dataset from http://nlp.stanford.edu/sentiment/index.html (the three links on the right-hand side), I don't see those files in any of the downloads. Could you kindly point me to where I can find the correct set of files? Thanks a ton.

```python
def load_sentiment_treebank(data_dir, fine_grained):
    voc = Vocab(os.path.join(data_dir, 'vocab-cased.txt'))

    split_paths = {}
    for split in ['train', 'test', 'dev']:
        split_paths[split] = os.path.join(data_dir, split)

    fnlist = [tNode.encodetokens, tNode.relabel]
    arglist = [voc.encode, fine_grained]
    #fnlist,arglist=[tNode.relabel],[fine_grained]

    data = {}
    for split, path in split_paths.iteritems():
        sentencepath = os.path.join(path, 'sents.txt')
        treepath = os.path.join(path, 'parents.txt')
        labelpath = os.path.join(path, 'labels.txt')
        trees = parse_trees(sentencepath, treepath, labelpath)
        if not fine_grained:
            trees = [tree for tree in trees if tree.label != 0]
        trees = [(processTree(tree, fnlist, arglist), tree.label) for tree in trees]
        data[split] = trees

    return data, voc
```
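For reference, here is the directory layout the loader appears to assume, inferred purely from the `os.path.join` calls in the code above (these generated files are not part of the raw SST download, so the layout itself is an assumption). The helper names `expected_sst_layout` and `check_layout` are hypothetical, just for illustration:

```python
import os

def expected_sst_layout(data_dir):
    """Return the file paths load_sentiment_treebank reads,
    inferred from the os.path.join calls in the quoted code."""
    paths = [os.path.join(data_dir, 'vocab-cased.txt')]
    for split in ['train', 'test', 'dev']:
        for fname in ['sents.txt', 'parents.txt', 'labels.txt']:
            paths.append(os.path.join(data_dir, split, fname))
    return paths

def check_layout(data_dir):
    """List which of the expected files are missing under data_dir."""
    return [p for p in expected_sst_layout(data_dir) if not os.path.exists(p)]
```

Running `check_layout` against the extracted download would show exactly which of the ten expected files are missing.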
