Skip to content

Conversation

@tpeng
Copy link
Contributor

@tpeng tpeng commented Mar 8, 2014

add CRFsuite backend base on the crfsuite's SWIG python package.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isn't warning issued here if parsing of bigram features is not supported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since label bigram is supported by CRFsuite.

@kmike
Copy link
Member

kmike commented Mar 12, 2014

Hey @tpeng,

CRFsuite support is great!
Can you please clarify some points? I haven't used CRFsuite SWIG wrapper, but CRFsuite supports variable number of features, so

  1. why learn feature_names_?
  2. why use {'token': ('tag', 1.0)} interface instead of {'token:tag': 1.0}?
  3. what is CRF++ templates support needed for?

@tpeng
Copy link
Contributor Author

tpeng commented Mar 18, 2014

Hi @kmike,

they are all related. for 1./3. we use CRF++ template to expand the features with feature_names_. expanding with feature functions in CRFsuite is ok too, but i found with CRF++ template is convenient.

for 2. since feature_names_ are like {'token': xx} not {'token:tag': 'xx'}, that's why we former one.

add a default features set for CRFsuite which demostrate how
to create bigram/trigram features with python rather than template.
@tpeng
Copy link
Contributor Author

tpeng commented Mar 18, 2014

as discussed with @kmike , we decided abandon CRF++ template support. this way we can use the variable features which only supported by CRFsuite.

numeric features also supported if the feature function return a tuple like (value, scale)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of code duplication. I think this function should call num_pattern, or they both should call another function.

@tpeng tpeng changed the title CRFsuite backend add base classifier and global ngrams feature functions Apr 22, 2014
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good optional feature.

kmike added a commit that referenced this pull request Apr 22, 2014
add base classifier and global ngrams feature functions
@kmike kmike merged commit e3defff into scrapinghub:master Apr 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants