N-Quads parsing error during normalization  #10

@br41nlet

Overview

I ran into an issue while trying to normalize N-Quads with the URDNA2015 algorithm from the jsonld library (pyld.jsonld):

pyld.jsonld.JsonLdError: ('Could not convert input to RDF dataset before normalization.',)
Type: jsonld.NormalizeError
Cause: ('Error while parsing N-Quads invalid quad.',)
Type: jsonld.ParseError

Details

In dkg.js we're normalizing N-Quads using the following function:

async toNQuads(content, inputFormat) {
    const options = {
        algorithm: 'URDNA2015',
        format: 'application/n-quads',
    };

    if (inputFormat) {
        options.inputFormat = inputFormat;
    }

    const canonized = await jsonld.canonize(content, options);

    return canonized.split('\n').filter((x) => x !== '');
}

I tried to reproduce the same logic in dkg.py, but normalization fails when the input is N-Quads (JSON-LD input works fine). This may be either incorrect usage of the library on my side or a bug in jsonld, as the library no longer seems to be supported.

Python normalization function:

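from typing import Literal

from pyld import jsonld

# JSONLD, NQuads, DatasetInputFormatNotSupported and InvalidDataset
# are assumed to be defined elsewhere in dkg.py.
def normalize_dataset(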
    dataset: JSONLD | NQuads,
    input_format: Literal["JSON-LD", "N-Quads"] = "JSON-LD",
) -> NQuads:
    normalization_options = {
        "algorithm": "URDNA2015",
        "format": "application/n-quads",
    }

    match input_format.lower():
        case "json-ld" | "jsonld":
            pass
        case "n-quads" | "nquads":
            normalization_options["inputFormat"] = "application/n-quads"
        case _:
            raise DatasetInputFormatNotSupported(
                f"Dataset input format isn't supported: {input_format}. "
                "Supported formats: JSON-LD / N-Quads."
            )

    n_quads = jsonld.normalize(dataset, normalization_options)
    assertion = [quad for quad in n_quads.split("\n") if quad]

    if not assertion:
        raise InvalidDataset("Invalid dataset, no quads were extracted.")

    return assertion
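
To narrow down which quad the parser rejects, one option is to feed the input line by line and collect the lines that raise. The sketch below is only a diagnostic aid, not a fix: `find_rejected_lines` and `toy_parse` are hypothetical names, and `toy_parse` is a stand-in that merely checks for the terminating ` .`, so it does not reproduce PyLD's actual parsing rules.

```python
from typing import Callable, List, Tuple


def find_rejected_lines(
    n_quads: str,
    parse: Callable[[str], object],
) -> List[Tuple[int, str]]:
    """Return (line_number, line) for each non-empty line that `parse` rejects."""
    rejected = []
    for number, line in enumerate(n_quads.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no quad
        try:
            parse(line + "\n")
        except Exception:  # deliberately broad: parser error types vary
            rejected.append((number, line))
    return rejected


def toy_parse(quad: str) -> str:
    """Stand-in parser (not PyLD): only checks for the terminating ' .'."""
    if not quad.rstrip().endswith(" ."):
        raise ValueError("invalid quad")
    return quad


sample = (
    '<http://example.org/s> <http://example.org/p> "ok" .\n'
    "\n"
    '<http://example.org/s> <http://example.org/p> "missing dot"\n'
)
bad = find_rejected_lines(sample, toy_parse)
print(bad)
```

In dkg.py the stand-in could be swapped for a callable that wraps the same `jsonld.normalize(...)` call used above, with `inputFormat` set to `application/n-quads`, so that each line goes through the real parser on its own.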

Labels: bug (Something isn't working)
