Conversation

@jlelong
Contributor

@jlelong jlelong commented Dec 3, 2021

This PR adds basic built-in LaTeX support and closes James-Yu/LaTeX-Workshop#2731.

Once this PR is merged, you might consider merging microsoft/vscode-markdown-tm-grammar#99 too.

@aeschli
Contributor

aeschli commented Dec 8, 2021

@jlelong Thanks a lot!
@alexr00 and I wondered whether, by any chance, we could merge some of the languages.

@jlelong
Contributor Author

jlelong commented Dec 8, 2021

Looking closely, the files latex-language-configuration.json and latex-weave-language-configuration.json can be merged. As for the rest, I do not think any of these languages can really be merged, but we could decide that some are too specific and should not be part of the built-in extension. Here is the current situation.

  1. Standard LaTeX support needs latex-language-configuration.json, LaTeX.tmLanguage.json and TeX.tmLanguage.json. Because of an issue when embedding C++ code into LaTeX ("Embedding C++ code in latex is broken but C code works fine" #76603, "Prefer source.cpp.embedded.latex in cpp minted env" James-Yu/LaTeX-Workshop#1604), we need cpp-grammar-bailout.tmLanguage.json and latex-cpp-embedded-language-configuration.json. Some LaTeX packages allow mixing Markdown and LaTeX code. This is handled by markdown-latex-combined.tmLanguage.json and markdown-latex-combined-language-configuration.json.
  2. Dynamic LaTeX with R or Julia code needs latex-weave-language-configuration.json, JLweave.tmLanguage.json and Rweave.tmLanguage.json.
  3. BibTeX support needs Bibtex.tmLanguage.json.
  4. LaTeX3 support needs latex3-language-configuration.json and LaTeX-Expl3.tmLanguage.json.
  5. BibTeX-Style support needs BibTeX-style.tmLanguage.json.
  6. DocTeX support (.dtx files are used for creating LaTeX packages) needs DocTeX.tmLanguage.json and doctex-language-configuration.json.

Cases 4, 5 and 6 are of interest only to (very?) advanced users and may remain in LaTeX-Workshop.

The *-language-configuration.json files have a lot in common, but I do not know how to take advantage of these common settings to make things simpler. The file latex-cpp-embedded-language-configuration.json is actually a copy of the C++ language configuration, but we need it in https://github.com/jlelong/vscode/blob/628381f4dedf8918e5eda880d56bd2b033c1ff34/extensions/latex/package.json#L107-L109

      {
        "id": "cpp_embedded_latex",
        "configuration": "latex-cpp-embedded-language-configuration.json"
      }

Could we just refer to the C++ language configuration file here?

@jlelong
Contributor Author

jlelong commented Dec 9, 2021

@aeschli and @alexr00

Consider extension A with

"languages": [
  {
    "id": "langA",
    "configuration": "./langA-language-configuration.json"
  }
],
"grammars": [
  {
    "language": "langA",
    "scopeName": "source.A",
    "path": "./syntaxes/A.tmLanguage.json"
  }
]

  1. Is it possible to refer to the language configuration file of langA in another extension?
  2. Is it possible in extension A to create langB with a configuration file ./langB-language-configuration.json, which is an extended version of ./langA-language-configuration.json? Do we need to copy the content of ./langA-language-configuration.json to ./langB-language-configuration.json, or is there a more elaborate way to proceed?
  3. Can we create in extension B a language langB with the same configuration file as langA but with a grammar file ./syntaxes/B.tmLanguage.json provided by extension B?

All these questions are related to your question about merging languages. Thanks a lot for your help.

@aeschli
Contributor

aeschli commented Dec 9, 2021

Is it possible to refer to the language configuration file of langA in another extension?

No, you can only reference a file located in your own extension.

There's no mechanism to have one JSON file extend another one, or to have languages extend each other.

-> You'll have to duplicate some configurations.

Two languages can point to the same language config file (or grammar). That's fine.

What could be done, but I don't recommend it, is to define the language configurations in code (see the LanguageConfiguration API). We prefer the basic extensions to be declarative only (no main file). That way they work in all extension hosts.

@aeschli
Contributor

aeschli commented Dec 9, 2021

If you feel that the duplications are error-prone, we could look into a build script that creates the files, or a script that we call on compile to validate that the rules match.
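
A validation script of this kind could look roughly like the sketch below. It is only an illustration: the function name findDrift and the inline sample configurations are hypothetical stand-ins for the real language configuration files, and it assumes that keys present only in the derived file are intentional extensions.

```javascript
// Hypothetical validation step: names and sample data are illustrative only.
const { isDeepStrictEqual } = require('util');

/**
 * Return the keys of `base` whose values differ in `derived`.
 * Keys that appear only in `derived` are treated as intentional extensions.
 */
function findDrift(base, derived) {
  return Object.keys(base).filter(
    (key) => !isDeepStrictEqual(base[key], derived[key])
  );
}

// Inline stand-ins for the parsed contents of two configuration files.
const latexConfig = {
  comments: { lineComment: '%' },
  brackets: [['{', '}'], ['[', ']'], ['(', ')']],
};
const weaveConfig = {
  comments: { lineComment: '%' },
  brackets: [['{', '}'], ['[', ']']], // one bracket pair was dropped by mistake
  folding: { offSide: false },        // extra key, not reported as drift
};

const drift = findDrift(latexConfig, weaveConfig);
console.log(drift); // lists the shared keys that no longer match
```

In a compile step, a non-empty result would typically be turned into a failing exit code so that the mismatch is noticed before the files are published.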

@jlelong
Contributor Author

jlelong commented Dec 9, 2021

Thanks for making all this clearer.

Before asking, I had already considered relying on some scripts to create the language configuration files from a common base. I just wanted to make sure that there was no simpler approach. Therefore, I will go on with the scripts.
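
As a rough sketch of the script-based approach, the snippet below derives several configurations from one shared base. Everything in it is an assumption made for illustration: the buildConfigurations helper, the sample settings, and the rule that a key in the per-language part replaces the base value wholesale. It is not the script that was actually used.

```javascript
// Hypothetical build step: file names and settings are illustrative only.

// Settings shared by every derived language configuration.
const base = {
  comments: { lineComment: '%' },
  brackets: [['{', '}'], ['[', ']'], ['(', ')']],
};

// Per-language additions; a key given here replaces the base value wholesale.
const overrides = {
  'latex-language-configuration.json': {},
  'markdown-latex-combined-language-configuration.json': {
    comments: { blockComment: ['<!--', '-->'] },
  },
};

function buildConfigurations(baseConfig, perLanguage) {
  const result = {};
  for (const [fileName, extra] of Object.entries(perLanguage)) {
    // Deep-copy via JSON so generated configs never share nested arrays.
    result[fileName] = JSON.parse(JSON.stringify({ ...baseConfig, ...extra }));
  }
  return result;
}

const generated = buildConfigurations(base, overrides);
for (const [fileName, config] of Object.entries(generated)) {
  // A real build step would write each file to disk instead of logging it.
  console.log(fileName, JSON.stringify(config));
}
```

A shallow, key-level merge keeps the rule easy to reason about; a deeper merge would need a decision on how arrays such as brackets are combined.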

@aeschli
Contributor

aeschli commented Dec 9, 2021

On the number of languages. Sorry for only realizing that now. It would be great if we could focus on the most popular ones.
Are rsweave and jlweave commonly used, or maybe also better covered by an extension?

@jlelong
Contributor Author

jlelong commented Dec 9, 2021

OK, let us decide that LaTeX3, BibTeX-style and DocTeX remain in LaTeX-Workshop. The languages TeX, LaTeX and BibTeX should clearly go into the built-in extension.

The languages rsweave and jlweave are certainly less common than pure LaTeX, even though there were enough requests to implement them. Supporting these two languages boils down to adding only the short files syntaxes/JLweave.tmLanguage.json and syntaxes/RSweave.tmLanguage.json. They basically just create a superset of the LaTeX language with embedded Julia or R code. As basic Julia and R support is provided by built-in extensions, I think rsweave and jlweave could go into the built-in extension, but obviously the choice is yours 🤔

@jlelong jlelong force-pushed the jlelong/latex-grammar branch from ce21007 to c9913f3 Compare December 10, 2021 08:11
jlelong added a commit to jlelong/vscode-latex-basics that referenced this pull request Dec 10, 2021
The support for these languages will still be provided by LaTeX-Workshop.
Related to microsoft/vscode#138395
@jlelong
Contributor Author

jlelong commented Dec 10, 2021

I have removed the support for LaTeX3, BibTeX-style and DocTeX from vscode-latex-basics, and therefore from this PR, to focus only on the LaTeX (including rsweave and jlweave) and BibTeX languages.

@aeschli
Contributor

aeschli commented Dec 10, 2021

Thanks a lot, @jlelong ! I'd prefer that we move rsweave and jlweave back to the extension for now.
I couldn't find many resources on these languages on the web. That makes it hard to justify why we take these and not other languages that were also requested as built-in languages.
Sorry for my ignorance, and really sorry for the extra work I caused by only looking at this now.

In the grammars declaration, only keep the embedded languages provided
as a built-in extension
@jlelong
Contributor Author

jlelong commented Dec 10, 2021

I'd prefer that we move rsweave and jlweave back to the extension for now. I couldn't find many resources on these languages on the web. That makes it hard to justify why we take these and not other languages that were also requested as built-in languages.

Well, so be it, even though the built-in extension will then not entirely solve James-Yu/LaTeX-Workshop#2983, but that is another story.
I have edited the PR accordingly.

FYI, rsweave and jlweave are not languages per se, even though they appear as such in VS Code. They are LaTeX code with embedded R code (for rsweave) or Julia code (for jlweave), which is executed at compile time. The rsweave and jlweave grammars basically just include the LaTeX grammar and the R or Julia one.
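
To make the idea concrete, here is a minimal sketch of what such a layered grammar might look like, written as a JavaScript object that can be serialized to a tmLanguage.json file. It is not the actual RSweave grammar: the scope name, the chunk delimiters (an opening <<options>>= line and a closing lone @) and the content scope are assumptions chosen for illustration.

```javascript
// Sketch of a TextMate grammar that layers R code chunks over LaTeX.
// The scope name and the chunk delimiters are assumptions for illustration.
const rsweaveGrammar = {
  name: 'R Sweave (sketch)',
  scopeName: 'text.tex.latex.rsweave',
  patterns: [
    {
      // An R chunk opens with <<options>>= and closes with a lone @.
      begin: '^<<.*?>>=\\s*$',
      end: '^@\\s*$',
      contentName: 'meta.embedded.block.r',
      // Inside the chunk, defer to the R grammar.
      patterns: [{ include: 'source.r' }],
    },
    // Everything outside a chunk is tokenized as ordinary LaTeX.
    { include: 'text.tex.latex' },
  ],
};

console.log(JSON.stringify(rsweaveGrammar, null, 2));
```

The chunk rule comes first so that it takes precedence over the LaTeX rules when both could match at the same position.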

@jlelong jlelong force-pushed the jlelong/latex-grammar branch from 249741f to 1a51707 Compare December 10, 2021 21:48
@aeschli
Contributor

aeschli commented Dec 13, 2021

Cool, thanks a lot!
I added some colorize-test files that (hopefully) cover some of the basics of LaTeX/TeX and BibTeX. Feel free to extend/improve the samples.
Each test file has a file with the expected tokenization and colorization.
After a grammar update, we rerun the tests (./scripts/test-integration.sh|bat). If there are differences in the tokenization or colorization the tests will fail and overwrite the expected result files with the new results. We review the changes and if they all look good and make sense, then we commit the new result as the new expected results.
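
The compare-then-overwrite behaviour described above can be sketched as follows. This is not the VS Code test harness: the checkSnapshot helper, the file name and the token data are all made up for illustration, and the demo writes to a temporary directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

/**
 * Compare `actual` tokens with the snapshot stored at `snapshotPath`.
 * On a mismatch (or a missing snapshot) the snapshot is overwritten and
 * false is returned, so the change shows up in the working tree for review.
 */
function checkSnapshot(snapshotPath, actual) {
  const serialized = JSON.stringify(actual, null, '\t');
  const expected = fs.existsSync(snapshotPath)
    ? fs.readFileSync(snapshotPath, 'utf8')
    : undefined;
  if (expected === serialized) {
    return true;
  }
  fs.writeFileSync(snapshotPath, serialized);
  return false;
}

// Demo with a temporary directory and made-up token data.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'colorize-'));
const snapshot = path.join(dir, 'test_tex.json');
const tokens = [{ c: '\\section', t: 'made.up.scope.name' }];

const firstRun = checkSnapshot(snapshot, tokens);  // no snapshot yet: written, fails
const secondRun = checkSnapshot(snapshot, tokens); // matches the stored snapshot
console.log(firstRun, secondRun);
```

Overwriting on failure means the diff of the result files in version control is the review artifact: accepting a grammar change amounts to committing the rewritten snapshot.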

@aeschli
Contributor

aeschli commented Dec 13, 2021

@alexr00 Ok to merge?

@jlelong
Contributor Author

jlelong commented Dec 13, 2021

Great. I will have a look at the colorize-test files and see whether I can add some more elaborate content based on the issues we have solved in LaTeX-Workshop.

Member

@alexr00 alexr00 left a comment

Looks great, thank you! Just a couple questions about the licenses.

"suitability for any purpose.",
"",
"The file syntaxes/markdown-latex-combined.tmLanguage.json is generated from the Markdown grammar",
"included in VSCode and falls under the license described in markdown-latex-combined-license.txt.",
Member

Where is markdown-latex-combined-license.txt?

Member

And also cpp-bailout-license.txt?

Contributor Author

I guess I have been a little aggressive when removing the extra languages.
I have just added them back.

Member

@alexr00 alexr00 left a comment

Looks good, thank you!

@alexr00 alexr00 merged commit 65a12fa into microsoft:main Dec 15, 2021
@jlelong jlelong deleted the jlelong/latex-grammar branch December 15, 2021 20:56
@jlelong
Contributor Author

jlelong commented Dec 25, 2021

@aeschli @alexr00
FWIW, following our discussion on how to derive a new language from an existing one (locally or remotely), I have developed a JS utility https://github.com/jlelong/vscode-extend-language published to npmjs.

@aeschli
Contributor

aeschli commented Jan 2, 2022

@jlelong How would you use the utility?

@jlelong
Contributor Author

jlelong commented Jan 3, 2022

@aeschli Thanks for your interest in this!

See for instance markdown-latex-combined.extension.language-configuration.json, which defines latex-markdown as an extension of latex.

The expanded file can be obtained by running

const vel = require('vscode-extend-language')
vel.expandConfigurationFile('./languages/data/markdown-latex-combined.extension.language-configuration.json', './languages/markdown-latex-combined-language-configuration.json')

Even if this utility has to be run after every update to the base file, it makes propagating changes far easier. The ultimate solution would be for VS Code to do the expansion itself when loading a language-configuration.json file, but that may be too complicated to implement.

Note that the base file can be remote, for instance defined in another extension. For the time being, we rely on this utility in vscode-latex-basics and LaTeX-Workshop.

@aeschli
Contributor

aeschli commented Jan 4, 2022

Is the idea to run the utility as a build step and then write out the result to a language-configuration.json file?
Sounds good to me.

@jlelong
Contributor Author

jlelong commented Jan 4, 2022

Exactly. It is designed to be run as a build step.

@aeschli
Contributor

aeschli commented Jan 4, 2022

BTW, bibtex currently has no language configuration, is that intended?

"id": "bibtex",

@jlelong
Contributor Author

jlelong commented Jan 4, 2022

Yes, bibtex has a very basic syntax and does not need any specific language configuration.

@github-actions github-actions bot locked and limited conversation to collaborators Jan 29, 2022

Development

Successfully merging this pull request may close these issues.

Basic language extension as a built-in extension in VS Code

3 participants