Fri Dec 05, 2014 4:41 pm
#16
Re: Imaginary Alphabets
Are you volunteering, Juggler? Because I get the impression you'd do an excellent job of it
Cornflakes_91 wrote: Well, creating one manually and generating them procedurally are different kinds of work, though.
Doesn't have to be fully procedural. But agreed, otherwise.
Scytale wrote: Are you volunteering, Juggler? Because I get the impression you'd do an excellent job of it.
I've actually never made a conlang. I know, I know... I'm totally losing my Language Geek™ card for that.
TheJuggler wrote: When I got bored in high school I developed a habit of designing writing systems. I never got into conlanging, though, so they were all for English.
Oh my god, I did this too! It looked more like Arabic though, convincingly enough that several people thought I had converted to Islam. (I still find Arabic to be one of the prettiest looking languages... if only it sounded nearly as pretty.) The symbols were based on math, but it was still just plain English; the vowels were exponential curves, the consonants were angles or repeating lines softly written into each other. I actually employed this language occasionally for cheating on tests, to useful effect.
Hyperion wrote: Oh my god, I did this too! It looked more like Arabic though, convincingly enough that several people thought I had converted to Islam. (I still find Arabic to be one of the prettiest looking languages... if only it sounded nearly as pretty.) The symbols were based on math, but it was still just plain English; the vowels were exponential curves, the consonants were angles or repeating lines softly written into each other. I actually employed this language occasionally for cheating on tests, to useful effect.
People jump to such wild conclusions sometimes.
mazerRackham wrote: An entire procedural language could be created fairly easily. Well, that's the understatement of the year, but the mechanics of making it wouldn't be all that hard. The steps as I see them are as follows. (Feel free to modify/enhance the steps as you see fit.)
1. Create an alphabet using a variable number of characters in the character set.
I would personally use the IPA as a base for all the sounds that can be made. This is fortunately a finite list. This allows one to create a language that is actually pronounceable. You would have to find a proper balance between consonants and vowels in order for this to be practical. Assign each sound to a number (much like we have assigned each letter of the alphabet to an ASCII number).
After assigning the sounds of our new language number values we can make the glyph set, which most people would define as the "fun" part... (Don't forget punctuation!!!)
2. Create words.
This would be the hard part, but in my mind the "fun" part. I would take a huge language data set, say all of the English Wikipedia, and use something like this -> http://en.wikipedia.org/wiki/Tf%E2%80%93idf in order to find the most "important" words. That's the easy part. The hard part would be to define the parts of speech, i.e. nouns, verbs, adjectives, etc.
After we find the "important" words comes the simple part of creating a large group of words using our new alphabet according to rules. These rules would once again be procedurally generated but some things are just not pronounceable which would create the bounds for the new words. A rule would be defined by regular expressions. With these new rules in place we could create new words all day, but we only have to create words for the "important" words that we have already found with the possibility of synonyms.
3. Create the grammar.
This would also be defined by regular expressions. We would define what a proper sentence would be, i.e. a noun first followed by an optional adjective and verb followed by an optional adverb followed by an effectual comma and a period. Or something like that. Define all the ways a sentence could be formed and then we are off to the races. We can now piece together all the parts of the language in order to create full texts that could be deciphered in game.
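For what it's worth, steps 1 and 2 can be sketched in a few lines of Python. This is only a rough illustration under loud assumptions: the sound inventory is a tiny made-up subset rather than the full IPA, the "pronounceable" rule is one hypothetical regular expression (syllables of the shape optional consonant, vowel, optional consonant), and the names SOUND_CODES, PRONOUNCEABLE, random_word and make_lexicon are invented for this example.

```python
import random
import re

# Assumption: a tiny invented inventory; a real one would draw on the IPA chart.
CONSONANTS = "ptkmnslr"
VOWELS = "aeiou"

# Step 1: give every sound a number, much like ASCII does for letters.
SOUND_CODES = {sound: code for code, sound in enumerate(CONSONANTS + VOWELS)}

# Step 2: one hypothetical pronounceability rule written as a regular expression.
# A word is one or more syllables: optional consonant, vowel, optional consonant.
PRONOUNCEABLE = re.compile(f"(?:[{CONSONANTS}]?[{VOWELS}][{CONSONANTS}]?)+")


def random_word(rng, min_len=2, max_len=6):
    """Draw random sound strings until one satisfies the rule."""
    sounds = CONSONANTS + VOWELS
    while True:
        length = rng.randint(min_len, max_len)
        candidate = "".join(rng.choice(sounds) for _ in range(length))
        if PRONOUNCEABLE.fullmatch(candidate):
            return candidate


def make_lexicon(meanings, seed):
    """Map each 'important' word to a unique generated word."""
    rng = random.Random(seed)
    lexicon = {}
    for meaning in meanings:
        word = random_word(rng)
        while word in lexicon.values():  # avoid giving two meanings one word
            word = random_word(rng)
        lexicon[meaning] = word
    return lexicon


if __name__ == "__main__":
    print(make_lexicon(["star", "ship", "trade"], seed=42))
```

Seeding the generator means the same seed always yields the same lexicon, so each alien race could simply be a seed plus a rule set.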
mazerRackham wrote: 1. Create an alphabet using a variable number of characters in the character set.
I would personally use the IPA as a base for all the sounds that can be made. This is fortunately a finite list. This allows one to create a language that is actually pronounceable. You would have to find a proper balance between consonants and vowels in order for this to be practical. Assign each sound to a number (much like we have assigned each letter of the alphabet to an ASCII number).
After assigning the sounds of our new language number values we can make the glyph set, which most people would define as the "fun" part... (Don't forget punctuation!!!)
An alien language doesn't have to be pronounceable by humans.
mazerRackham wrote: 2. Create words.
This would be the hard part, but in my mind the "fun" part. I would take a huge language data set, say all of the English Wikipedia, and use something like this -> http://en.wikipedia.org/wiki/Tf%E2%80%93idf in order to find the most "important" words. That's the easy part. The hard part would be to define the parts of speech, i.e. nouns, verbs, adjectives, etc.
After we find the "important" words comes the simple part of creating a large group of words using our new alphabet according to rules. These rules would once again be procedurally generated but some things are just not pronounceable which would create the bounds for the new words. A rule would be defined by regular expressions. With these new rules in place we could create new words all day, but we only have to create words for the "important" words that we have already found with the possibility of synonyms.
You'd only create a variation of the language you are analysing; those values are not uniform across human languages, let alone alien ones.
mazerRackham wrote: 3. Create the grammar.
This would also be defined by regular expressions. We would define what a proper sentence would be, i.e. a noun first followed by an optional adjective and verb followed by an optional adverb followed by an effectual comma and a period. Or something like that. Define all the ways a sentence could be formed and then we are off to the races. We can now piece together all the parts of the language in order to create full texts that could be deciphered in game.
If you hardcode such a definition, all the languages you generate follow that pattern, which again isn't constant across all grammar systems.
Scytale wrote: I have to admit, that was a hell of a first post.
But in a good way! Gotta love that can-do attitude!
Cornflakes_91 wrote: don't feel attacked by this
Cornflakes_91 wrote: an alien language doesn't have to be pronounceable by humans
I agree, but I think it would be cool if humans were able to pronounce it. This could allow for "communication" between the humans in the LT-verse and aliens. If we look at the example of Japanese, they have a phonetic alphabet for Japanese use, and one for Gaijin, or non-Japanese people. My thought is that the aliens would have their own alphabet that they can pronounce, as well as an alphabet that all races can pronounce.
"Pronounceable" is a flexible term even between human languages; applying a metric like that to non-human languages is at best limiting.
There are also infinitely many character sets possible. Just look at Juggler's construction on the first page of this thread: it's nothing like any common character set, and that's just one of the infinite sets.
Cornflakes_91 wrote: you'd only create a variation of the language you are analysing, that values are not uniform over the human language space, not to speak about alien languages.
True, I would not capture all of the race's culture and the specific words known only to that race, but I would (hopefully!) capture the intersection between their language and ours by doing this. I think of the novel Stranger in a Strange Land (I highly recommend this book to anyone who likes alien cultures): we learn about the Martian language from a human perspective, and learn that there just isn't a way to communicate some Martian ideas in English, or any other human language. But most ideas in English can be translated to Martian, and Martian to English. If we take our most common words, then we can assume that those words can be found in an alien language. As long as we put a random component in the generation of the alien word usage (i.e. some words are in the alien language, and some are not), we can ensure a completely alien feel to our alien friends (or enemies...).
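A minimal sketch of the word-picking idea, assuming the corpus is already split into tokenized documents. It ranks words by tf-idf (term frequency times log of total documents over documents containing the word) and then randomly drops some of them so each alien vocabulary has gaps. The function names and the keep_top and keep_probability parameters are made up for this example. One caveat worth knowing: a word that appears in every document gets a tf-idf of zero, so this picks distinctive words rather than the most common ones.

```python
import math
import random
from collections import Counter


def tf_idf_scores(documents):
    """Return each word's best tf-idf score over a list of tokenized documents."""
    n_docs = len(documents)
    # Number of documents each word appears in at least once.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    best = {}
    for doc in documents:
        counts = Counter(doc)
        for word, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word])
            best[word] = max(best.get(word, 0.0), tf * idf)
    return best


def alien_vocabulary(documents, keep_top, keep_probability, seed):
    """Take the top-scoring words, then randomly leave some out."""
    scores = tf_idf_scores(documents)
    # Sort by descending score, breaking ties alphabetically for repeatability.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))[:keep_top]
    rng = random.Random(seed)
    return [word for word in ranked if rng.random() < keep_probability]


if __name__ == "__main__":
    docs = [["ship", "ship", "star"], ["star", "trade"]]
    print(alien_vocabulary(docs, keep_top=2, keep_probability=0.8, seed=3))
```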
Cornflakes_91 wrote: if you hardcode such a definition all the languages you generate follow that pattern, which again isnt constant over all grammar systems.
Well, yes, if I hard-coded a specific rule set for the grammar, all "alien" languages would be the same, but the grammar would be procedurally generated. These generated patterns would be unique to each new language: two languages could be similar, but none would be the same, and none would seem to be derived from English. Each language could have hundreds or thousands of rules that define that race's grammar. (When I say grammar, I am referring to the grammar in Discrete Mathematics used to define ALL languages. See http://www.cs.utsa.edu/~bylander/cs2233 ... andout.pdf)
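To make "procedurally generated grammar" concrete, here is a toy sketch in the formal-grammar sense: a grammar is a set of production rules, and the generator picks the rules from a seed instead of hard-coding them. Everything here is an assumption for illustration (the symbol names, the LEXICON of made-up words, and the functions generate_grammar and expand), and it only varies word order, so it yields a dozen possible grammars rather than the hundreds or thousands of rules described above.

```python
import random

# Hypothetical alien words per part of speech, purely for illustration.
LEXICON = {
    "NOUN": ["taka", "miru", "solen"],
    "VERB": ["pek", "unar"],
    "ADJ": ["lis", "karo"],
}


def generate_grammar(seed):
    """Pick this language's word order from the seed instead of hard-coding it."""
    rng = random.Random(seed)
    sentence = ["NP", "VERB", "NP"]  # subject/verb/object in some order
    rng.shuffle(sentence)
    noun_phrase = ["ADJ", "NOUN"]    # adjective before or after the noun
    rng.shuffle(noun_phrase)
    # Each symbol maps to its list of alternative productions.
    return {"S": [sentence], "NP": [noun_phrase, ["NOUN"]]}


def expand(symbol, grammar, lexicon, rng):
    """Recursively expand a symbol into a list of words."""
    if symbol in grammar:
        production = rng.choice(grammar[symbol])
        words = []
        for part in production:
            words.extend(expand(part, grammar, lexicon, rng))
        return words
    # Not a rule name, so it is a part of speech: pick a word for it.
    return [rng.choice(lexicon[symbol])]


if __name__ == "__main__":
    for seed in (1, 2, 3):
        grammar = generate_grammar(seed)
        print(grammar["S"][0], " ".join(expand("S", grammar, LEXICON, random.Random(0))))
```

Two seeds can still land on the same word order, which matches the point that languages may be similar; a richer rule generator would be needed before "none would be the same" holds.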
This is not even consistent inside some languages; applying such a strict ruleset would again just produce variations of the same language (in this case English, I guess).
mazerRackham wrote: That being said, I disagree that there are an infinite number of character sets possible. This is because no character set can be infinite or it wouldn't be defined. The best you could get is a huge number of characters, say 1,000,000, but at this point the characters become meaningless because no one, not even an alien race, could remember and use that many characters at a time. Eventually the language would evolve to contain a smaller set of characters that were most used. Look at Chinese: arguably the language on Earth with the largest character set, it has under 200,000 characters, and this is counting ALL their characters, not only their phonetic alphabet. Therefore it cannot be infinite, just a huge number of combinations; otherwise the language would not be usable.
I agree that each character set on its own cannot be infinite, but there can be an infinite number of finite character sets.
mazerRackham wrote: If we look at the example of Japanese, they have a phonetic alphabet for Japanese use, and one for Gaijin, or non-Japanese people.
That's not how Japanese works at all. Hiragana is used for words of Japanese origin, whereas katakana is used for loanwords, or for emphasis (like capital letters in English). It has nothing to do with the person speaking. (That said, robots are usually given katakana speech, but they're usually given capital letters in English too.)
DigitalDuck wrote: English has (well, is supposed to have) a similar thing, where loanwords are written in italics, especially Latin. If you're writing a professional publication, citing Bird et al., mentioning something done on an ad hoc basis, referring to a priori knowledge, et cetera should all be italicised.
Lol! We'd have to italicize every other word -- almost none of our vocabulary is the original Germanic.