Last modified on 20 June 2013, at 02:10

User:AugPi/Lojban

Digraphs

Numbers

Digits

  • 0 : no
  • 1 : pa
  • 2 : re
  • 3 : ci
  • 4 : vo
  • 5 : mu
  • 6 : xa
  • 7 : ze
  • 8 : bi
  • 9 : so
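
Multi-digit numbers are spelled by stringing these digit words together in order, so the list above can be sketched as a small lookup. This is only an illustration: the helper name `to_lojban_digits` is hypothetical, and the sketch handles non-negative integers only.

```python
# Digit words, indexed by value, taken from the list above.
DIGITS = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"]


def to_lojban_digits(n: int) -> str:
    """Spell a non-negative integer digit by digit (hypothetical helper)."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Look up each decimal digit and separate the words with spaces.
    return " ".join(DIGITS[int(d)] for d in str(n))


print(to_lojban_digits(2013))  # re no pa ci
```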

Gismu acting as either selbri or sumti

Examples:

  • pilno
  • bloti
  • barda ("sumti modifier", corresponding to English adjective)
  • sutra ("selbri modifier", corresponding to English adverb)

Grammatical terms

  • selbri : predicate
  • sumti : argument
  • bridi : predication (predicate + argument(s))
  • cmavo : structural word
  • brivla : content word
    • morphologically there are three kinds of these: gismu, lujvo, and fu'ivla.
    • functionally, these act in either of two ways: as selbri or as sumti (this latter case only when modified by certain gadri).
  • tanru : a compound brivla
    • seltau : first component of a tanru (cf. LISP's CAR)
    • tertau : second component of a tanru (cf. LISP's CDR)
  • lujvo : a "fossilized" tanru (made up of rafsi)
  • fu'ivla : a word borrowed from another language, prepended with a semantic disambiguation tag
  • cmene : a "name", i.e., proper noun
  • gismu : a root (content) word
  • rafsi : an abbreviated form of a gismu (useful especially for forming lujvo)
  • place structure (of a selbri) : the selbri's definition (in terms of parameters x1, x2, etc.); Lojban's version of case frame.
    • Place structure inheritance: e.g., tumxra inherits its place structure from its component pixra.
      • But there may be "pruning": e.g., trutca from tcadu.
      • But zgipli inherits sumti slots from both of its components. Each sumti slot of a lujvo would be inherited from (and equated to) some sumti slot of one or the other of its components.
  • observative : a bridi without an x1 sumti (or is it just a bridi without any sumti?)
  • gadri : articles
  • conversion : swapping of a selbri's first sumti with another one of that selbri's sumti.
  • sumti tcita : preposition (for an extra, "labeled sumti", place in a selbri's place structure); the preposition is the "modal" form of some gismu which determines the preposition's meaning. (example: fi'e)
    • if the labeled sumti is an internal sumti, then the sumti tcita is preceded by be.
      • if the internal sumti modifies a cmene then use pe instead of be.
  • internal sumti : ≈ relative clause prepositional phrase
    • begun with be.
  • MEX : mathematical expression
  • abstraction : a kind of subordinate clause
  • abstractor : transforms a selbri (right before it becomes sumtified)
    • rough analogies: Laplace transform, wavelet transform (DHWT, ...), ...
  • descriptor : a gadri
  • description : a noun phrase
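
The "conversion" entry above (swapping a selbri's first sumti with another one) can be sketched by treating a place structure as an ordered list of slots. Two things here are assumptions rather than part of these notes: the conversion cmavo se, te, ve, xe (supplied as background), and the slot labels for pilno, which are a loose paraphrase for illustration. The function name `convert` is hypothetical.

```python
# Conversion cmavo and the 1-based place each one swaps with x1.
# Assumption: these cmavo are background knowledge, not listed in the notes.
CONVERSION = {"se": 2, "te": 3, "ve": 4, "xe": 5}


def convert(places, cmavo):
    """Return a new place list with x1 swapped with the place named by cmavo."""
    n = CONVERSION[cmavo]
    if n > len(places):
        raise ValueError(f"selbri has no x{n} place to swap with x1")
    swapped = list(places)  # copy, so the original place structure is untouched
    swapped[0], swapped[n - 1] = swapped[n - 1], swapped[0]
    return swapped


# Illustrative slot labels for pilno (x1, x2, x3); a paraphrase, not a definition.
pilno = ["user", "tool", "purpose"]
print(convert(pilno, "se"))  # ['tool', 'user', 'purpose']
print(convert(pilno, "te"))  # ['purpose', 'tool', 'user']
```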

Investiganda

Tanru vs. lujvo

Rough idea:

  • tanru : two (or more) brivla written separately but acting as a single sumti or a single selbri
  • lujvo : two (or more?) rafsi fused into a single brivla
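
The tanru half of this rough idea can be sketched as a pair of seltau (first component) and tertau (second component), much like a LISP cons cell. The names `Tanru` and `parse_tanru` are hypothetical. The left-to-right grouping of three or more brivla is an assumption added here as background (the default when no grouping cmavo are used); it is not stated in these notes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Tanru:
    seltau: "Unit"  # first component (the modifier)
    tertau: "Unit"  # second component (the head)


Unit = Union[str, Tanru]


def parse_tanru(words):
    """Group brivla left to right: a b c becomes ((a b) c)."""
    if len(words) < 2:
        raise ValueError("a tanru needs at least two brivla")
    unit = words[0]
    for word in words[1:]:
        # Everything grouped so far becomes the seltau of the next word.
        unit = Tanru(seltau=unit, tertau=word)
    return unit


t = parse_tanru(["barda", "bloti"])
print(t.seltau, t.tertau)  # barda bloti
```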

Lujvo cmene

  • Would ritygu'e be a lujvo or a cmene (or both)?
    • If it is a lujvo, then what would its place structure be (cmene don't have place structure)? Could x2 be filled only with brito, or would any arbitrary subset of brito also do? (If it is fixed to brito, that would obviate the need for x2.)
    • Perhaps more to the point, wouldn't the x1 sumti of ritygu'e have to be ritygu'e itself? Also, in la ritygu'e, ritygu'e is modified with la, which should modify cmene, so it looks like ritygu'e should be classified as a cmene even though it is morphologically more of a lujvo (e.g., it does not end in a consonant and full stop as cmene usually do, but ends, rather anomalously, in a vowel). That is, when form (morphology) clashes with function (part of speech), go with the function, as in the adage «form follows function».
      • See http://jbovlaste.lojban.org/dict/ritygu%27e : according to its definition, sumti x2 and those following it are obviated, and only x1 remains. Someone in ritygu'e could spread their arms and say "Ti ritygu'e", and it would make sense, especially if addressing visitors.
      • http://www.mail-archive.com/lojban-beginners@lojban.org/msg05849.html (tijlan.) recommends the label "Lujvo cmene": strictly speaking, ritygu'e would be a lujvo which becomes a cmene only when modified with la. Then again, cmene are only cmene when modified with la (otherwise they are just cmevla, but how would a non-cmene cmevla actually function in a Lojban sentence?). Anyway, since POS headers such as "Proper noun" and "Compound cmavo" are already in use, there could be a "Lujvo cmene" POS header and category as well.

sumti tcita

  • ri'a is a sumti tcita, so should it act more like a preposition ("because of") than a conjunction ("because")?
    • If it is (apparently) acting as a conjunction, it is probably followed immediately by gadri + abstractor, such as lenu, so the actual conjunction would be, say, ri'alenu rather than just ri'a.
    • because = ki'ulenu/ri'alenu/&c./ki'ulonu/ri'alonu/&c.; because of = ki'u/ri'a/&c.
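
The pattern in the last bullet (sumti tcita alone for "because of"; sumti tcita + gadri + abstractor for "because") can be written out mechanically. This is only a sketch of the compound spellings listed above: the function name `conjunction_forms` is hypothetical, and just the two sumti tcita, two gadri, and one abstractor that appear in these notes are included.

```python
# Items that appear in the notes above; the "&c." cases are left out.
GADRI = ["le", "lo"]
ABSTRACTOR = "nu"


def conjunction_forms(tcita):
    """Build the compound 'because' forms for one sumti tcita (hypothetical helper)."""
    return [tcita + gadri + ABSTRACTOR for gadri in GADRI]


for tcita in ["ki'u", "ri'a"]:
    # The bare tcita plays the "because of" role; the compounds play "because".
    print(tcita, "->", conjunction_forms(tcita))
```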

Lojban ELE

The following POS headers are currently admissible:

  1. Cmavo
  2. Gismu
    • This is a closed category, and WT already appears to have entries for all of Lojban's gismu (1342 of them).
  3. Rafsi
  4. Brivla
  5. Proper noun

Notes

  • The first three are closed categories: they are "hardwired" into the Lojban baseline. Any such words should be automatically includable (without second thought) in WT, since they are, by definition (not to say by LLG's fiat), part of Lojban.
  • The last two are open categories: new Lojban words in these categories could be coined ad hoc, so think twice and proceed with caution before adding any such terms to WT, as such words might have trouble meeting the requirements of CFI.

Online Resources & References

Tutorials

Grammars

Gismu

Rafsi

Cmavo

Selma'o

Online Parser/Translator (jboski)

Another one

Offline Parser (runs on Terminal)

How to set up and run:

Offline Parser (runs on a web browser, through JavaScript)

How to set up and run:

  • Go to http://mhagiwara.github.io/camxes.js/
  • Save the web page as an HTML file.
    • Saving the page should also automatically create a new folder next to the HTML file, containing the file camxes.js.
  • Open that HTML file in a web browser (that can run JavaScript).
  • Type a lojbanic expression into the Demo textbox. Its parse, if it has one, should be shown underneath; otherwise an error message should appear.

Texts

Lojban formal grammar

!   ' |\t' ;
at the top: it defines whitespace! Do not use any whitespace in any regexes (in the lex part) unless it is actually part of the Lojban terms. Replace the error 's in the "gaps" with nothing (∅, ε, i.e., whitespace). Paste the code into the big text box in the middle, click on Build (the window blows up widthwise), then click on Run. That should implement Step 6 of the parser. (Lexical tokens, if any, should be added manually or through some pre-parsing code.) A neat parse tree should be generated in the bottom right corner, along with action and goto tables for the LALR(1) parser (consisting of 894 states!) on the lower-middle left side.