Imaginary Alphabets

TheJuggler wrote: When I got bored in high school I developed a habit of designing writing systems. I never got into conlanging, though, so they were all for English.

Oh my god, I did this too! It looked more like Arabic though, convincingly enough that several people thought I had converted to Islam. (I still find Arabic to be one of the prettiest-looking languages... if only it sounded nearly as pretty.) The symbols were based on math, but it was still just plain English; the vowels were exponential curves, the consonants were angles or repeating lines softly written into each other. I actually employed this language occasionally for cheating on tests, to useful effect.

Can't say I remember it though, it was just one of my many little projects as a teenager.


Re: Imaginary Alphabets

Fri Dec 05, 2014 7:24 pm

#21 by TheJuggler

Hyperion wrote:
TheJuggler wrote: When I got bored in high school I developed a habit of designing writing systems. I never got into conlanging, though, so they were all for English.
Oh my god, I did this too! It looked more like Arabic though, convincingly enough that several people thought I had converted to Islam. (I still find Arabic to be one of the prettiest-looking languages... if only it sounded nearly as pretty.) The symbols were based on math, but it was still just plain English; the vowels were exponential curves, the consonants were angles or repeating lines softly written into each other. I actually employed this language occasionally for cheating on tests, to useful effect.

People jump to such wild conclusions sometimes

But I totally agree with you -- written Arabic looks so graceful. I think the language sounds... well, if not "pretty" then at least really cool.


Re: Imaginary Alphabets

Sat Dec 06, 2014 2:11 pm

#22 by mazerRackham

An entire procedural language could be created fairly easily. Well that's the understatement of the year, but the mechanics of making it wouldn't be all that hard. The steps as I see them are as follows. (Feel free to modify/enhance the steps as you see fit).

1. Create an alphabet using a variable number of characters in the character set.
I would personally use the IPA as a base for all the sounds that can be made. This is fortunately a finite list. This allows one to create a language that is actually pronounceable. You would have to find a proper balance between consonants and vowels in order for this to be practical. Assign each sound to a number (much like we have assigned each letter of the alphabet to an ASCII number).
After assigning the sounds of our new language number values we can make the glyph set, which most people would define as the "fun" part... (Don't forget punctuation!!!)

2. Create words.
This would be the hard part, but in my mind the "fun" part. I would take a huge language data set, say all of the English Wikipedia, and use something like this ->http://en.wikipedia.org/wiki/Tf%E2%80%93idf in order to find the most "important" words. That's the easy part. The hard part would be to define the parts of speech i.e. nouns, verbs, adjectives, etc.
After we find the "important" words comes the simple part of creating a large group of words using our new alphabet according to rules. These rules would once again be procedurally generated but some things are just not pronounceable which would create the bounds for the new words. A rule would be defined by regular expressions. With these new rules in place we could create new words all day, but we only have to create words for the "important" words that we have already found with the possibility of synonyms.

3. Create the grammar.
This would also be defined by regular expressions. We would define what a proper sentence would be, e.g. a noun first, followed by an optional adjective and a verb, followed by an optional adverb, followed by an effectual comma and a period. Or something like that. Define all the ways a sentence could be formed and then we are off to the races. We can now piece together all the parts of the language in order to create full texts that could be deciphered in game.
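
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the poster's implementation: the tiny sound inventory, the consonant-vowel syllable shape, the 0.3 chance of a closing consonant, and the fixed noun/adjective/verb pattern are all made up for the example, and names such as `make_word` and `build_lexicon` are hypothetical.

```python
import random
import re

# Step 1 (assumed mini-inventory): a handful of IPA symbols, each given a
# numeric ID. A real inventory would be drawn from the full IPA chart.
CONSONANTS = ["p", "t", "k", "m", "n", "s", "l"]
VOWELS = ["a", "i", "u", "e"]
SOUND_IDS = {sound: i for i, sound in enumerate(CONSONANTS + VOWELS)}

# Step 2: a regular expression bounds what counts as pronounceable.
# Assumed rule: one to three syllables shaped consonant + vowel (+ consonant).
C = "[" + "".join(CONSONANTS) + "]"
V = "[" + "".join(VOWELS) + "]"
WORD_RULE = re.compile(f"(?:{C}{V}{C}?){{1,3}}")


def make_word(rng):
    """Build a random word syllable by syllable; it fits WORD_RULE by construction."""
    syllables = []
    for _ in range(rng.randint(1, 3)):
        syllable = rng.choice(CONSONANTS) + rng.choice(VOWELS)
        if rng.random() < 0.3:  # assumed chance of a closing consonant
            syllable += rng.choice(CONSONANTS)
        syllables.append(syllable)
    return "".join(syllables)


def build_lexicon(important_words, rng):
    """Assign one unique generated word to each 'important' source word."""
    lexicon, used = {}, set()
    for source_word in important_words:
        word = make_word(rng)
        while word in used:
            word = make_word(rng)
        used.add(word)
        lexicon[source_word] = word
    return lexicon


# Step 3: one hard-coded sentence pattern, as a list of part-of-speech slots.
PATTERN = ["noun", "adjective", "verb"]


def make_sentence(words_by_pos, rng):
    """Fill each slot of PATTERN with a random word of that part of speech."""
    words = [rng.choice(words_by_pos[pos]) for pos in PATTERN]
    return " ".join(words) + "."
```

Calling `build_lexicon(["star", "ship", "trade"], random.Random(0))` gives one invented word per source word, and seeding the generator keeps a given language reproducible. Because `PATTERN` is fixed, every language built this way shares one sentence shape.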


Re: Imaginary Alphabets

Sat Dec 06, 2014 3:02 pm

#23 by TheJuggler

mazerRackham wrote:An entire procedural language could be created fairly easily. Well that's the understatement of the year, but the mechanics of making it wouldn't be all that hard. The steps as I see them are as follows. (Feel free to modify/enhance the steps as you see fit).

1. Create an alphabet using a variable number of characters in the character set.
I would personally use the IPA as a base for all the sounds that can be made. This is fortunately a finite list. This allows one to create a language that is actually pronounceable. You would have to find a proper balance between consonants and vowels in order for this to be practical. Assign each sound to a number (much like we have assigned each letter of the alphabet to an ASCII number).
After assigning the sounds of our new language number values we can make the glyph set, which most people would define as the "fun" part... (Don't forget punctuation!!!)

2. Create words.
This would be the hard part, but in my mind the "fun" part. I would take a huge language data set, say all of the English Wikipedia, and use something like this ->http://en.wikipedia.org/wiki/Tf%E2%80%93idf in order to find the most "important" words. That's the easy part. The hard part would be to define the parts of speech i.e. nouns, verbs, adjectives, etc.
After we find the "important" words comes the simple part of creating a large group of words using our new alphabet according to rules. These rules would once again be procedurally generated but some things are just not pronounceable which would create the bounds for the new words. A rule would be defined by regular expressions. With these new rules in place we could create new words all day, but we only have to create words for the "important" words that we have already found with the possibility of synonyms.

3. Create the grammar.
This would also be defined by regular expressions. We would define what a proper sentence would be, e.g. a noun first, followed by an optional adjective and a verb, followed by an optional adverb, followed by an effectual comma and a period. Or something like that. Define all the ways a sentence could be formed and then we are off to the races. We can now piece together all the parts of the language in order to create full texts that could be deciphered in game.

Although this is probably enough to satisfy a lot of people, unfortunately it isn't gonna do it for me. I mean, I hate to burst your bubble, but what you have described is a relex of English, maybe with some altered word order here and there. Languages are much more flexible than that -- I'm thinking especially of agglutinative langs here.

That being said, a relex of English might be the best we can do with procedural generation.

Welcome to the forums!

(Also, I'm loving the username.)


Re: Imaginary Alphabets

Sat Dec 06, 2014 3:05 pm

#24 by Cornflakes_91

Don't feel attacked by this.

mazerRackham wrote: 1. Create an alphabet using a variable number of characters in the character set.
I would personally use the IPA as a base for all the sounds that can be made. This is fortunately a finite list. This allows one to create a language that is actually pronounceable. You would have to find a proper balance between consonants and vowels in order for this to be practical. Assign each sound to a number (much like we have assigned each letter of the alphabet to an ASCII number).
After assigning the sounds of our new language number values we can make the glyph set, which most people would define as the "fun" part... (Don't forget punctuation!!!)

An alien language doesn't have to be pronounceable by humans.

"Pronounceable" is a flexible term even between human languages; applying a metric like that to non-human languages is limiting at best.

There are also infinitely many character sets possible. Just look at TheJuggler's construction on the first page of this thread: it's nothing like any common character set, and that's just one of the infinite sets.

mazerRackham wrote: 2. Create words.
This would be the hard part, but in my mind the "fun" part. I would take a huge language data set, say all of the English Wikipedia, and use something like this ->http://en.wikipedia.org/wiki/Tf%E2%80%93idf in order to find the most "important" words. That's the easy part. The hard part would be to define the parts of speech i.e. nouns, verbs, adjectives, etc.
After we find the "important" words comes the simple part of creating a large group of words using our new alphabet according to rules. These rules would once again be procedurally generated but some things are just not pronounceable which would create the bounds for the new words. A rule would be defined by regular expressions. With these new rules in place we could create new words all day, but we only have to create words for the "important" words that we have already found with the possibility of synonyms.

You'd only create a variation of the language you are analysing; those importance values are not uniform across the human language space, not to speak of alien languages.

mazerRackham wrote: 3. Create the grammar.
This would also be defined by regular expressions. We would define what a proper sentence would be, e.g. a noun first, followed by an optional adjective and a verb, followed by an optional adverb, followed by an effectual comma and a period. Or something like that. Define all the ways a sentence could be formed and then we are off to the races. We can now piece together all the parts of the language in order to create full texts that could be deciphered in game.

If you hardcode such a definition, all the languages you generate follow that pattern, which again isn't constant over all grammar systems.
This is not even consistent inside some languages; applying such a strict ruleset would again just produce variations of the same language (in this case English, I guess).

Regardless of my deconstruction: welcome to the forums!


Re: Imaginary Alphabets

Sat Dec 06, 2014 3:13 pm

#25 by Scytale

I have to admit, that was a hell of a first post


Re: Imaginary Alphabets

Sat Dec 06, 2014 3:27 pm

#26 by TheJuggler

Scytale wrote:I have to admit, that was a hell of a first post

But in a good way! Gotta love that can-do attitude!


Re: Imaginary Alphabets

Sat Dec 06, 2014 4:20 pm

#27 by mazerRackham

Cornflakes_91 wrote: Don't feel attacked by this.

Don't worry, I started my original post asking for criticism, so thanks!!!

Cornflakes_91 wrote: An alien language doesn't have to be pronounceable by humans.

"Pronounceable" is a flexible term even between human languages; applying a metric like that to non-human languages is limiting at best.

There are also infinitely many character sets possible. Just look at TheJuggler's construction on the first page of this thread: it's nothing like any common character set, and that's just one of the infinite sets.

I agree, but I think it would be cool if humans would be able to pronounce it. This could allow for "communication" between the humans in the LT-verse and aliens. If we look at the example of Japanese, they have a phonetic alphabet for Japanese use, and one for Gaijin, or non-Japanese people. My thought is that the aliens would have their own alphabet that they can pronounce, as well as an alphabet that all races can pronounce.

That being said, I disagree that there are an infinite number of character sets possible. This is because no character set can be infinite or it wouldn't be defined. The best you could get is a huge number of characters, say 1,000,000, but at this point the characters become meaningless because no one, not even an alien race, could remember and use that many characters at a time. Eventually the language would evolve to contain a smaller set of the most-used characters. Look at Chinese: arguably the language on Earth with the largest character set, it has under 200,000 characters, and that is counting ALL of its characters, not only a phonetic alphabet. Therefore the number cannot be infinite, just a huge number of combinations; otherwise the language would not be usable.

Cornflakes_91 wrote: You'd only create a variation of the language you are analysing; those importance values are not uniform across the human language space, not to speak of alien languages.

True, I would not capture all of the race's culture and the specific words known only to that race, but I would capture (hopefully!) the intersection between their language and ours by doing this. I think of the novel Stranger in a Strange Land (I highly recommend this book to anyone who likes alien cultures): we learn about the Martian language from a human perspective, and learn that there just isn't a way to communicate some Martian ideas in English, or in any other human language. But most of the ideas in English can be translated to Martian, and Martian to English. If we take our most common words, then we can assume that those words can be found in an alien language. As long as we put a random component in the generation of the alien word usage (i.e. some words are in the alien language, and some are not), we can ensure a completely alien feel to our alien friends (or enemies...).
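
As a rough sketch of the idea in the reply above (rank the source words, then randomly leave some out of the alien language), the following assumes a plain tf-idf over already tokenized documents. The function names, the `keep_top` and `drop_chance` parameters, and the choice to keep each word's best score across documents are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def tfidf_scores(documents):
    """Return each word's highest tf-idf score over a list of tokenized documents."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per document
    best = {}
    for doc in documents:
        counts = Counter(doc)
        for word, count in counts.items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word])
            best[word] = max(best.get(word, 0.0), tf * idf)
    return best


def alien_vocabulary(documents, keep_top, drop_chance, rng):
    """Take the top-scoring words, then randomly leave some out of the alien language."""
    scores = tfidf_scores(documents)
    # Sort by descending score, breaking ties alphabetically for reproducibility.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))[:keep_top]
    return [w for w in ranked if rng.random() >= drop_chance]
```

One caveat with this particular weighting: a word that appears in every document gets an idf of zero, so tf-idf favours distinctive words over the most common ones. If the goal is really "our most common words", a plain frequency count may be the better fit.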

Cornflakes_91 wrote: If you hardcode such a definition, all the languages you generate follow that pattern, which again isn't constant over all grammar systems.
This is not even consistent inside some languages; applying such a strict ruleset would again just produce variations of the same language (in this case English, I guess).

Well yes, if I hard-coded a specific rule set into the grammar, all "alien" languages would be the same, but the grammar would be procedurally generated. These generated patterns would be unique to the new language; two languages could turn out similar, but none would be the same, and none would seem to be derived from English. Each language could have hundreds or thousands of rules that would define that race's grammar. (When I say grammar I am referring to the grammar in Discrete Mathematics used to define ALL languages. See http://www.cs.utsa.edu/~bylander/cs2233 ... andout.pdf)
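
To make "procedurally generated grammar" concrete, here is a minimal sketch in which each language rolls its own production rules from a seed. It is far smaller than the hundreds of rules imagined above, and the only things randomized (the order of subject, object, and verb, and whether adjectives come before or after nouns) are assumptions picked for the example. `make_grammar` and `expand` are hypothetical names.

```python
import random


def make_grammar(rng):
    """Roll a small set of production rules for one language."""
    # Shuffle the order of subject, object, and verb for this language.
    order = ["SUBJ", "OBJ", "VERB"]
    rng.shuffle(order)
    # Decide whether adjectives precede or follow their noun.
    noun_phrase = ["ADJ", "NOUN"] if rng.random() < 0.5 else ["NOUN", "ADJ"]
    return {
        "SENTENCE": [order],
        "SUBJ": [["NOUN"], noun_phrase],
        "OBJ": [["NOUN"], noun_phrase],
    }


def expand(symbol, grammar, lexicon, rng):
    """Recursively rewrite a symbol until only words from the lexicon remain."""
    if symbol not in grammar:
        # Terminal category such as NOUN, ADJ, or VERB: pick a word.
        return [rng.choice(lexicon[symbol])]
    production = rng.choice(grammar[symbol])
    words = []
    for part in production:
        words.extend(expand(part, grammar, lexicon, rng))
    return words
```

Seeding `make_grammar` with a per-race value, for example `make_grammar(random.Random(race_seed))`, would give each race a stable grammar. Note that this sketch only varies word order, so it does not yet answer the objection that real grammars differ in more than ordering.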

Thanks Cornflakes for the criticism, hopefully this clears some of my ideas up!


Re: Imaginary Alphabets

Sat Dec 06, 2014 4:27 pm

#28 by Cornflakes_91

mazerRackham wrote: That being said, I disagree that there are an infinite number of character sets possible. This is because no character set can be infinite or it wouldn't be defined. The best you could get is a huge number of characters, say 1,000,000, but at this point the characters become meaningless because no one, not even an alien race, could remember and use that many characters at a time. Eventually the language would evolve to contain a smaller set of the most-used characters. Look at Chinese: arguably the language on Earth with the largest character set, it has under 200,000 characters, and that is counting ALL of its characters, not only a phonetic alphabet. Therefore the number cannot be infinite, just a huge number of combinations; otherwise the language would not be usable.

I agree that each character set by itself cannot be infinite, but there can be an infinite number of finite character sets.

And by choosing one set out of that infinite number you limit yourself to an infinitely small part of that infinite space.

That reminds me: I could actually start learning some linguistics.


Re: Imaginary Alphabets

Sat Dec 06, 2014 6:34 pm

#29 by DigitalDuck

mazerRackham wrote:If we look at the example of Japanese, they have a phonetic alphabet for Japanese use, and one for Gaijin, or non-Japanese people.

That's not how Japanese works at all. Hiragana is used for words of Japanese origin, whereas katakana is used for loanwords, or for emphasis (like capital letters in English). It has nothing to do with the person speaking. (That said, robots are usually given katakana speech, but they're usually given capital letters in English too.)

English has (well, is supposed to have) a similar thing, where loanwords are written in italics, especially Latin. If you're writing a professional publication, citing Bird et al., mentioning something done on an ad hoc basis, referring to a priori knowledge, et cetera should all be italicised.


Re: Imaginary Alphabets

Sat Dec 06, 2014 11:42 pm

#30 by TheJuggler

@mazer:
I was trying to point out that foreign grammars are more than just different word orders. For instance, some languages mark clauses for evidentiality (e.g. did the speaker witness it, hear about it, or suppose that it must have happened?). Some languages (such as Spanish) don't require the use of pronouns at all (unless you are emphasizing something) because that information is already contained in the verb itself. Mandarin Chinese does not have a past tense -- it instead uses aspect, a system which indicates to what extent an action has been completed, to indirectly indicate time. Then there's reduplication. For example, in Swahili, "piga" means "to strike," and "pigapiga" means "to strike repeatedly." And some languages, dubbed "polysynthetic," are capable of packing an entire sentence's worth of information into a single word (think of the Spanish example, but taken waaaay past 11). According to Wikipedia, the Yupik word "tuntussuqatarniksaitengqiggtuq" means "He had not yet said again that he was going to hunt reindeer." The word consists of the morphemes tuntu-ssur-qatar-ni-ksaite-ngqiggte-uq, and, as Wikipedia helpfully notes, none of them (except "tuntu") can appear on their own.

These are just a few examples, but I think they do an excellent job of illustrating the wonderful diversity that can be found among the world's languages.

Maybe instead of trying to procedurally gen. all languages we should focus on generating different "types."
(i.e. an algorithm for isolating langs, an algorithm for agglutinative langs, etc.)
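
A per-type generator along those lines could start as small as this. The sketch below is only an illustration of the difference between the two types named above, plus the reduplication example; the feature names and the particle and suffix tables passed in are hypothetical, and real morphology is far messier.

```python
def isolating(root, features, particles):
    """Isolating style: the root stays unchanged and each feature is a separate word."""
    return " ".join([root] + [particles[f] for f in features])


def agglutinative(root, features, suffixes):
    """Agglutinative style: each feature adds one suffix to the same word."""
    return root + "".join(suffixes[f] for f in features)


def reduplicate(root):
    """Full reduplication, as in the 'piga' / 'pigapiga' example above."""
    return root + root
```

With made-up markers `{"past": "le", "plural": "mi"}`, the root "tak" comes out as "tak le mi" in the isolating style and "taklemi" in the agglutinative one, so the same feature table can drive either type.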

Also, I'm glad that you are so accepting of critiques!

...I still feel like I'm beating you over the head, though.

DigitalDuck wrote: English has (well, is supposed to have) a similar thing, where loanwords are written in italics, especially Latin. If you're writing a professional publication, citing Bird et al., mentioning something done on an ad hoc basis, referring to a priori knowledge, et cetera should all be italicised.

Lol! We'd have to italicize every other word -- almost none of our vocabulary is the original Germanic.
