[wip] add settings-based code extractors #1080

Draft
wants to merge 1 commit into
base: main

@bollwyvl (Collaborator) commented Apr 22, 2024

References

Code changes

  • adds cellTypes to RegExpForeignCodeExtractor.IOptions
  • adds schema/transclusions.json to define a simplified extractor
  • adds ILSPCustomTransclusionsManager token

User-facing changes

  • users will be able to add new extractors from the Settings Editor

Backwards-incompatible changes

  • n/a

Chores

  • linted
  • tested
  • documented
  • changelog entry

@bollwyvl bollwyvl changed the title from "[wip add settings-based code extractors" to "[wip] add settings-based code extractors" on Apr 22, 2024

Binder 👈 Launch a binder notebook on branch bollwyvl/jupyterlab-lsp/gh-1079-settings-code-extractors

@bollwyvl
Collaborator Author

As of opening this PR, things kinda work, e.g. this naive extractor finding <script> tags inside .md files and markdown cells:

{
  "enabled": true,
  "codeExtractors": {
    "md-js-script": {
      "hostLanguage": "ipythongfm",
      "foreignLanguage": "javascript",
      "pattern": "<script>((.|\n)*?(?=</script>))</script>",
      "foreignCaptureGroups": [1],
      "fileExtension": "js",
      "cellTypes": ["code", "markdown"],
      "isStandalone": false
    }
  }
}

Kinda, in that diagnostics work, but completion, go to definition, and other features do not appear to; I haven't debugged further.

The internal state of the upstream manager (and the extractor itself) is kind of hard to reason about, and generally requires a full page reload after making any changes.
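
Since testing inside the extension needs a page reload for every change, the pattern from the settings above can be checked on its own first. This is only a sketch: `extractForeignCode` is a made-up helper, not the extractor's actual code, and the `g` flag is an assumption added so that every `<script>` block is found.

```typescript
// Hypothetical stand-alone check of the "md-js-script" pattern.
// Nothing here is the extension's API; names are for illustration only.
const pattern = '<script>((.|\n)*?(?=</script>))</script>';
const foreignCaptureGroups = [1];

function extractForeignCode(hostCode: string): string[] {
  // Assumption: the "g" flag is added so repeated exec() calls walk all matches.
  const expression = new RegExp(pattern, 'g');
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = expression.exec(hostCode)) !== null) {
    for (const group of foreignCaptureGroups) {
      // Collect the text of each requested capture group.
      found.push(match[group] ?? '');
    }
  }
  return found;
}

const markdown =
  '# Title\n<script>\nconst x = 1;\n</script>\ntext <script>alert(x)</script>';
const extracted = extractForeignCode(markdown);
console.log(extracted);
```

With the sample input, group 1 yields the body of both the multi-line and the inline `<script>` block, which is what would end up in the virtual JavaScript document.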

@bollwyvl
Collaborator Author

image

},
"foreignCaptureGroups": {
"type": "array",
"description": "Array of numbers specifying match groups to be extracted from the regular expression match, for the use in virtual document of the foreign language",
Collaborator Author

Apparently we aren't limited to integer-indexed groups:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Named_capturing_group#browser_compatibility

The syntax, alas, is not portable:

# python
r"(?P<name>.*)"
/* js */
/(?<name>.*)/ 

But still...
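
A rough idea of what accepting both forms could look like on the JS side, assuming `foreignCaptureGroups` were widened to allow strings as well as numbers. `GroupRef` and `pickGroups` are hypothetical names, not part of the current schema or extractor.

```typescript
// Hypothetical: a capture group reference is either an index or a group name.
type GroupRef = number | string;

function pickGroups(match: RegExpExecArray, refs: GroupRef[]): string[] {
  return refs.map(ref =>
    typeof ref === 'number'
      ? (match[ref] ?? '') // integer-indexed group
      : (match.groups?.[ref] ?? '') // named group, JS syntax (?<name>...)
  );
}

// Built from a string, as a pattern coming from settings would be.
const scriptTag = new RegExp('<script>(?<code>[\\s\\S]*?)</script>');
const result = scriptTag.exec('<script>let a = 1;</script>');

const byIndex = result ? pickGroups(result, [1]) : [];
const byName = result ? pickGroups(result, ['code']) : [];
console.log(byIndex, byName);
```

Both lookups return the same text here, since the named group is also group 1. A pattern written with Python's `(?P<name>...)` syntax would fail to compile in `new RegExp`, so the portability problem remains.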

"properties": {
"pattern": {
"title": "Pattern",
"type": "string",
Collaborator Author

Should likely consider allowing this to be an array of strings that get pre-concatenated.
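
Something like the following is what is meant, assuming the schema allowed `pattern` to be either a string or an array of strings. `normalizePattern` is a hypothetical helper and the fragments are just the example pattern from earlier in this thread, split up for readability.

```typescript
// Hypothetical: "pattern" from settings is one string or a list of fragments.
function normalizePattern(pattern: string | string[]): string {
  // Fragments are joined with no separator before compiling the expression.
  return Array.isArray(pattern) ? pattern.join('') : pattern;
}

const fragments = ['<script>', '((.|\n)*?(?=</script>))', '</script>'];
const joined = normalizePattern(fragments);
const compiled = new RegExp(joined);
console.log(joined, compiled.test('<script>1</script>'));
```

The main benefit would be in the Settings Editor, where a long single-line pattern is hard to read and edit.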

Successfully merging this pull request may close these issues.

Define custom code extractors in settings
1 participant