Language design #1

vyorkin · 2016-12-02T07:28:31Z

Initial thoughts

informal description

written in JavaScript
no statements, only expressions
no var (variables are always declared in a global scope by default)
no return – ShitScript returns the last evaluated expression
no floating point numbers (only integers)
no unary operators (?)
no classes, no arrays, no objects (except for console and window)

shitty ideas

allow using -, ?, ! in function names (no-camel-case)
super weird type coercions (or some non-obvious implicit coercions just result in the string 'shit')
; -> )))
=== -> ====, !== -> !===
(...) -> [...]
function -> shit / fuck
try -> why-the-fuck-not
catch -> fucked-up
finally -> dont-fucking-care
say please to enable lexical scoping
you can't use a couple of numbers (e.g. 4 and 2) for no reason
x / 0 = Math.random()
if -> o-rly?
then -> ya-rly
else -> no-way
o-rly?-ya-rly-no-way for one-liners (without brackets)
. -> -> (works only for console and window)

an example program:

fuck wat[] {
  calculate!!![2, 0])))
}

shit calculate!!![y, x] please {
  z = 5)))
  why-the-fuck-not {
    o-rly? z % 2 ==== 0 {
      y / x)))
    } no-way {
      x / y)))
    }
  } fucked-up[e] {
    console.lol[e])))
  } dont-fucking-care {
    0)))
  }
}

P.S.: Not sure about using the words fuck and shit everywhere (may be considered offensive)
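
A minimal sketch of two of the rules listed above ("x / 0 = Math.random()" and "returns the last evaluated expression"), in plain JavaScript. The helper names `divide` and `evalBlock` are hypothetical, and truncating division toward zero is an assumption: the list only says there are no floating point numbers.

```javascript
// Hypothetical helpers illustrating two rules from the list; not existing ShitScript code.

// "x / 0 = Math.random()"; otherwise integer division.
// Truncation toward zero is an assumption, the issue does not specify rounding.
function divide(x, y) {
  if (y === 0) return Math.random();
  return Math.trunc(x / y);
}

// "no return": a block is modelled as a list of thunks (zero-argument functions),
// and its value is whatever the last evaluated expression produced.
function evalBlock(expressions) {
  let last;
  for (const expression of expressions) {
    last = expression();
  }
  return last;
}

// Usage: a two-expression block whose value is the second expression.
const value = evalBlock([() => divide(1, 0), () => divide(9, 3)]);
console.log(value);
```

Modelling expressions as thunks keeps evaluation order explicit, which matters once a block contains expressions with side effects such as the assignment in the example program.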


ghaiklor · 2016-12-02T20:34:28Z

What ideas do you have, comrades, for the parser/tokens/etc.? Are we going to use some existing tools for defining our alphabet, lexical/semantic rules and so on?

E.g. we could use Jison for the parser.

vyorkin · 2016-12-02T20:38:07Z

looks interesting, @ghaiklor, haven't seen it before, will definitely play with it tonight.
recently I had some experience with pegjs and I believe I know how to write an LL(k) parser from scratch (I'm reading the book Language Implementation Patterns by Terence Parr), but yes, I think it's better/easier to use existing tools (DSL + parser generator) for describing our shitty formal grammar, and Jison looks promising at first sight.

chicoxyzzy · 2016-12-02T21:14:35Z

function -> shit

I like this particularly because we can have some sort of a higher order shit

vyorkin · 2017-03-06T20:11:37Z

ok, sorry for not doing anything for quite a while, I'll get back to it very soon, I hope!

ghaiklor · 2017-03-09T14:50:01Z

@vyorkin I played a little bit with LLVM... What if we take LLVM as the compiler backend and write an LLVM frontend for our ShitScript?

vyorkin · 2017-05-18T12:13:28Z

@ghaiklor good idea (I've just watched this talk https://www.youtube.com/watch?v=PauCAyVg348). I need to build something very simple first (sorry, I still don't have enough time)

chicoxyzzy · 2017-05-18T12:17:53Z

we definitely should target LLVM so we'll be able to use emscripten to target wasm

chicoxyzzy · 2017-05-18T12:18:19Z

oops

ghaiklor · 2017-05-18T18:03:24Z

@vyorkin here is my playground for LLVM, nothing special though - https://github.com/ghaiklor/llvm-kaleidoscope

vyorkin · 2017-06-23T22:25:15Z

how about using Rust + llvm-rs + lalrpop to build this? I'm going to start working on it over the coming weekends, the time has come :)
my plan:
– build a very basic formal grammar
– generate a parser with lalrpop (it'll suffice for now; we'll need to write a custom lexer & parser later for performance reasons)
– write some tests to verify the resulting AST
– implement a visitor that will walk the AST and generate some LLVM IR
– provide a very basic REPL (to ease testing & playing with it) that will accept options like:

    -a, --ast      Parse and output AST
    -i, --llvm-ir  Build and output LLVM IR

we could use docopt or clap crates for CLI args parsing
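
The visitor and the two options from the plan above can be sketched roughly as follows. This is a JavaScript illustration only (the plan itself targets Rust, llvm-rs and lalrpop); the AST node shapes, the `IrEmitter` class and the `run` helper are all hypothetical, and the output is LLVM-IR-like text that has not been validated by LLVM.

```javascript
// Hypothetical AST shapes: { type: 'Integer', value } and
// { type: 'Binary', operator, left, right }.
const OPCODES = { '+': 'add', '-': 'sub', '*': 'mul', '/': 'sdiv', '%': 'srem' };

// Visitor that walks the AST and collects LLVM-IR-like instruction lines.
class IrEmitter {
  constructor() {
    this.lines = [];
    this.nextTemp = 0;
  }

  // Returns the operand text (a literal or a temporary name) for the visited node.
  visit(node) {
    switch (node.type) {
      case 'Integer':
        return String(node.value);
      case 'Binary': {
        const left = this.visit(node.left);
        const right = this.visit(node.right);
        const opcode = OPCODES[node.operator];
        if (opcode === undefined) throw new Error(`Unknown operator: ${node.operator}`);
        const temp = `%t${this.nextTemp++}`;
        this.lines.push(`${temp} = ${opcode} i32 ${left}, ${right}`);
        return temp;
      }
      default:
        throw new Error(`Unknown node type: ${node.type}`);
    }
  }
}

function emitIr(ast) {
  const emitter = new IrEmitter();
  const result = emitter.visit(ast);
  return [...emitter.lines, `ret i32 ${result}`].join('\n');
}

// Picks the output according to the proposed -a/--ast and -i/--llvm-ir options.
function run(ast, args) {
  if (args.includes('-a') || args.includes('--ast')) return JSON.stringify(ast, null, 2);
  if (args.includes('-i') || args.includes('--llvm-ir')) return emitIr(ast);
  throw new Error('Expected -a/--ast or -i/--llvm-ir');
}

// Usage: the AST for (2 + 3) * 4.
const ast = {
  type: 'Binary',
  operator: '*',
  left: { type: 'Binary', operator: '+', left: { type: 'Integer', value: 2 }, right: { type: 'Integer', value: 3 } },
  right: { type: 'Integer', value: 4 },
};
console.log(run(ast, ['--llvm-ir']));
```

Keeping the AST dump and the IR emission behind separate flags means the tests that verify the AST do not depend on the code generator.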

I'm still learning & playing with the llvm-rs crate (the Compile module is complex, with a lot of macros & metaprogramming stuff), but there aren't many alternatives. I've looked at them all and llvm-rs seems to be the most mature, though it's not under active development

ghaiklor · 2017-06-24T05:43:29Z

@vyorkin I've started researching parsers written in JavaScript. Found ~~good~~ possible solutions we can use.

Lexical analysis - Jalex. You describe rules via regular expressions and it calls a callback when a match is found. So we'll be able to describe the lexical rules as regular expressions and implement whatever actions are needed to return a stream of tokens.
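
To make the "regular expression plus callback" idea concrete, here is a hand-rolled sketch in plain JavaScript. It does not use Jalex's actual API (not shown in this thread); the `RULES` table, the token shapes and the `tokenize` function are hypothetical, and the identifier pattern is only one possible reading of "allow -, ?, ! in function names".

```javascript
// Hypothetical keyword subset taken from the issue's list.
const KEYWORDS = new Set(['o-rly?', 'ya-rly', 'no-way', 'please']);

// Each rule pairs a sticky regular expression with a callback that builds a token,
// or returns null to skip the match (whitespace). Order matters: first match wins.
const RULES = [
  [/\s+/y, () => null],
  [/\)\)\)/y, (text) => ({ type: 'terminator', value: text })],
  [/!===|====/y, (text) => ({ type: 'operator', value: text })],
  [/[0-9]+/y, (text) => ({ type: 'integer', value: Number(text) })],
  [/[A-Za-z_][A-Za-z0-9_]*(?:[-?!]+[A-Za-z0-9_]+)*[?!]*/y,
    (text) => ({ type: KEYWORDS.has(text) ? 'keyword' : 'identifier', value: text })],
  [/[=+\-*\/%,\[\]{}]/y, (text) => ({ type: 'punctuation', value: text })],
];

function tokenize(source) {
  const tokens = [];
  let position = 0;
  while (position < source.length) {
    let matched = false;
    for (const [pattern, action] of RULES) {
      pattern.lastIndex = position; // sticky flag: match only at this position
      const match = pattern.exec(source);
      if (match === null) continue;
      const token = action(match[0]);
      if (token !== null) tokens.push(token);
      position = pattern.lastIndex;
      matched = true;
      break;
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character '${source[position]}' at ${position}`);
    }
  }
  return tokens;
}

// Usage: one line from the example program.
console.log(tokenize('o-rly? z % 2 ==== 0 {'));
```

Because `-` is allowed inside names, `y-x` lexes as a single identifier in this sketch, while `y - x` (with spaces) lexes as a subtraction; a real grammar would have to settle that ambiguity explicitly.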

Syntax analysis (parsing) - Jison. It has its own simple built-in lexical analyzer, though I'm thinking of using Jalex, since we will definitely write our own scanner in the future.

Why did I choose them? They are compatible with the lex/yacc format, so you can describe definitions and translation rules the plain old way, as it was done in yacc.

For the grammar, we can try to find an existing grammar for JavaScript and just modify it to fit our needs.

I'm still looking at other lexical analyzers, but for parsing I didn't find much, so it seems Jison is our only option there.

vyorkin · 2017-06-24T09:55:55Z

@ghaiklor do you know any good LLVM bindings for nodejs? I've found only these 2:

https://github.com/dirk/llvm2
https://github.com/kevinmehall/node-llvm
both seem outdated :(
LLVM is hard, but I've already spent so much time learning it that I think it's too late to give up on it :)

ghaiklor · 2017-06-24T13:06:55Z

@vyorkin I'm wondering why you're sticking with LLVM 😸
IMO, LLVM is over-engineering for our case. It's hard to support and has a steep learning curve. I understand it will simplify the code-generation phases for us, but not by much. Even if you implement it with LLVM, you still need to implement:

Parser. Could be acorn/esprima/whatever gives us a parse tree, but I'm going to use some kind of parser generator like flex/bison (maybe JavaScript ports).
Semantic actions which call the LLVM IR builder. For that phase we need to implement our own semantic parser or somehow inject our own actions into the tools above. Or we need a tool that visits the parse tree and calls the LLVM IR builder. IMO, the best place to call the IR builder is in the semantic actions of our grammar. That way we can build the LLVM IR during parsing, which saves us another pass over the parse tree.

So the steps with LLVM will be close to this: define a scanner with rules that returns tokens with inherited and synthesized attributes, then pass these tokens into a parser that has our grammar with semantic actions. While parsing the tokens, the parser calls our semantic actions, which in turn call the LLVM IR builder. And don't forget the code-generation phase, which we also need to implement with LLVM.

Anyway, we won't get a magical solution for ShitScript if we stick to LLVM.

My initial idea is to examine existing generators for lexers and parsers, so we can build our own grammar from scratch and use generators to create the parsers. After that, I'm looking for a way to create our own code generator. Still thinking about it, but once we have a grammar and a parse tree, it's not a big issue to generate code in SSA form. Aaaand, once we have SSA form, it's not a big issue to generate assembly code from it. To be honest, I'm even thinking about generating machine code from JavaScript, but those are just thoughts.

What do you all think? @vyorkin @chicoxyzzy and maybe @bniwredyc
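
The "semantic actions call the IR builder during parsing" flow described above can be sketched as a tiny recursive-descent parser that never builds a parse tree. This is a plain JavaScript illustration, not Jison or LLVM code: `Builder` is a hypothetical stand-in for an IR builder, `compileExpression` is a made-up name, and the grammar covers only integers, `+ - * / %` and `[...]` grouping (following the "(...) -> [...]" idea).

```javascript
// Hypothetical stand-in for an IR builder: records instructions, returns value names.
class Builder {
  constructor() {
    this.instructions = [];
    this.counter = 0;
  }
  constant(value) {
    return String(value);
  }
  binary(opcode, left, right) {
    const name = `%t${this.counter++}`;
    this.instructions.push(`${name} = ${opcode} i32 ${left}, ${right}`);
    return name;
  }
}

const IR_OPCODES = { '+': 'add', '-': 'sub', '*': 'mul', '/': 'sdiv', '%': 'srem' };

// Parses and emits in a single pass; returns the name of the final value.
function compileExpression(source, builder) {
  // Simplistic scanner: characters outside this pattern are silently ignored.
  const tokens = source.match(/\d+|[-+*\/%\[\]]/g) || [];
  let index = 0;
  const peek = () => tokens[index];
  const next = () => tokens[index++];

  // factor := integer | '[' expression ']'
  function factor() {
    const token = next();
    if (token === '[') {
      const value = expression();
      if (next() !== ']') throw new SyntaxError("Expected ']'");
      return value;
    }
    if (token === undefined || !/^\d+$/.test(token)) {
      throw new SyntaxError(`Unexpected token: ${token}`);
    }
    return builder.constant(Number(token));
  }

  // term := factor (('*' | '/' | '%') factor)*
  function term() {
    let left = factor();
    while (peek() === '*' || peek() === '/' || peek() === '%') {
      const operator = next();
      left = builder.binary(IR_OPCODES[operator], left, factor()); // semantic action
    }
    return left;
  }

  // expression := term (('+' | '-') term)*
  function expression() {
    let left = term();
    while (peek() === '+' || peek() === '-') {
      const operator = next();
      left = builder.binary(IR_OPCODES[operator], left, term()); // semantic action
    }
    return left;
  }

  const result = expression();
  if (index !== tokens.length) throw new SyntaxError(`Unexpected token: ${peek()}`);
  return result;
}

// Usage
const builder = new Builder();
const result = compileExpression('[2 + 3] * 4', builder);
console.log(builder.instructions.join('\n'), '\nresult:', result);
```

The trade-off is the one raised in the comment: emitting during parsing saves a pass over the tree, but leaves no tree to inspect or test separately.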

vyorkin · 2017-06-24T17:40:57Z

Wow, thanks! I'll give a detailed answer later today. Here is my latest unfinished playground in Rust, which I started after working through the LLVM Kaleidoscope tutorial series (same thing as you did, but I still haven't finished it :)). I've stopped here (LLVM IR Builder / Emitter visitor).

UPD:
I'm not sure about LLVM, but it's very appealing: we get various backends (e.g. emscripten can be used to target WASM), optimizations (traditional SSA-based, CFG-based, interprocedural analysis & transformations) for free, JIT and a lot of other stuff. In addition, it's a very valuable experience that could be useful in the future for building something real. But the learning curve is steep and I'm not sure it's worth the time (and I've already spent too much).

ghaiklor · 2017-06-25T11:06:54Z

@vyorkin

but it's very appealing: we get various backends (e.g. emscripten can be used to target WASM), optimizations (traditional SSA-based, CFG-based, interprocedural analysis & transformations) for free, JIT and a lot of other stuff

Agreed, though you still need to work out how to apply these optimizations correctly.

We are creating a ShitScript here, don't forget that. And the question is: is it worth investing so much time in LLVM to build a ShitScript? 😸
Maybe a language with just stupid code generation and no optimizations is exactly the point of calling it ShitScript, you know...

ghaiklor · 2017-06-25T12:40:20Z

@vyorkin also, I've just found LLVM itself compiled to JavaScript - https://github.com/kripken/llvm.js
Based on the demo, it looks like we'll be able to compile LLVM bitcode via JavaScript.

E.g.

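// `input` is LLVM IR in textual form
function process(input) {
  try {
    // Round-trip through the assembler (llvmAs) and the disassembler (llvmDis)
    return llvmDis(llvmAs(input));
  } catch (e) {
    // String errors are reported as compilation failures
    if (typeof e === 'string') {
      return 'Error in compilation: ' + e;
    }
    // Anything else is unexpected, so rethrow it
    throw e;
  }
}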

Worth noting that it's just a playground and, as the author mentioned:

This demo was done as a fun hacking project over a holiday vacation, so there are some caveats: The generated code is not optimized at all, so benchmarking is pointless; if you want to benchmark, run emscripten normally with -O2. Compilation speed has also not been optimized at all. Also, this demo has hardly been tested and glues together several codebases in ways they were not originally intended, there might be things that do not work.

chicoxyzzy · 2017-06-25T23:45:57Z

Sorry I'm too drunk for this kind of shit RN

vyorkin added the help wanted label Dec 2, 2016

chicoxyzzy closed this as completed May 18, 2017

chicoxyzzy reopened this May 18, 2017

