Muse Notes
TATN! (These are the notes!)
- 6/6/2010
- Published a nerfed, partial, projected dump of the Muse rhyming data.
- 1/10/2010
- Added a meaning-based "related words" exploder, in the new "Related" tab.
- 1/21/2009
- Dillfrog Yo Suite options are enabled by default, to give more useful results and faster response time.
- 1/18/2009
- Added "Contains (sound, not spelling)", and "Contains (spelling, not sound)".
- 1/16/2009
- Added "Vowel Sounds (exact)" and "Consonant Sounds (exact)" search types. Added "Gotta Have Meanings, Yo!" option, to filter out words that we don't personally have definitions for. Added search field to Meaning tab. Added "Notes" tab, to draw attention to this page. Added substring searches (now called "Begins With (spelling)", "Ends With (spelling)", and "Contains (spelling)").
- 7/3/2008
- Added "Rhyme (any)" search, which returns both rhymes and off-rhymes. You can still retrieve the isolated results if you prefer those, using the "Rhyme (slant)" and "Rhyme (perfect)" flavors. Added the "Word Type" selector you know and love. Added ability to search for words beginning with, ending with, or containing a certain word's pronunciation.
- We've got some atypical pronunciations.
- For example, try searching for perfect rhymes of "home". What's "room" doing in there? Well, there's a last name of "Home" (a proper noun) which is pronounced "Hume". Since that "Hume" pronunciation is in our system as well, you'll see words like "broom" and "zoom" alongside "gnome", "roam", etc. in the search results. Or vice versa: when searching for rhymes of "broom", you might see "home" in there.
Another example: "bung" can be pronounced as it's spelled (B AH NG), or with a silent G (B AH N).
For now you're stuck with it. In the future I'd like to give you the option to exclude certain pronunciations. Either that, or I'll have to be more picky about which proper nouns make it into our pronunciation dictionary. If you see any other strange pronunciations in there, let me know!
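To see why "room" shows up, here is a minimal sketch of a perfect-rhyme lookup over a dictionary that allows several pronunciations per word. The tiny PRONUNCIATIONS table, the rhyme_tail and perfect_rhymes names, and the exclude option are all made up for illustration; this is not the actual Muse code, just one way the behavior (and a possible "exclude this pronunciation" option) could look.

```python
# Hypothetical mini pronunciation dictionary (ARPAbet-style, stress digits on vowels).
PRONUNCIATIONS = {
    "home":  [("HH", "OW1", "M"), ("HH", "Y", "UW1", "M")],  # second entry: the surname, said "Hume"
    "gnome": [("N", "OW1", "M")],
    "roam":  [("R", "OW1", "M")],
    "room":  [("R", "UW1", "M")],
    "broom": [("B", "R", "UW1", "M")],
    "zoom":  [("Z", "UW1", "M")],
}

def rhyme_tail(phones):
    """Return the phonemes from the last primary-stressed vowel to the end."""
    for i in range(len(phones) - 1, -1, -1):
        if phones[i].endswith("1"):
            return phones[i:]
    return phones  # no primary stress marked: fall back to the whole word

def perfect_rhymes(word, exclude=()):
    """Words sharing a rhyme tail with any non-excluded pronunciation of `word`."""
    tails = {rhyme_tail(p) for p in PRONUNCIATIONS[word] if p not in exclude}
    return sorted(
        other
        for other, prons in PRONUNCIATIONS.items()
        if other != word and any(rhyme_tail(p) in tails for p in prons)
    )

print(perfect_rhymes("home"))
# Dropping the surname pronunciation leaves only the expected rhymes.
print(perfect_rhymes("home", exclude=[("HH", "Y", "UW1", "M")]))
```

In this sketch every pronunciation of the query word contributes its own rhyme tail, so one unusual pronunciation is enough to pull in a whole extra family of results.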
- In the pronunciation data, adjective satellites are not causing is_adjective to be set.
- Need to update and test the table builder to make sure is_adjective is set for both heads and satellites. I updated the table manually for now, but that'll be lost on the next big build. Plus, it only fixes things for words that have the satellite as the primary POS.
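In WordNet's data files, adjective heads carry the synset type "a" and adjective satellites carry "s", so a builder that only checks for "a" will miss satellites. A minimal sketch of the check, assuming the builder sees one synset-type code per sense of a word; the pos_flags function and every flag other than is_adjective are hypothetical:

```python
# WordNet synset type codes: n = noun, v = verb, a = adjective,
# s = adjective satellite, r = adverb.
ADJECTIVE_TYPES = {"a", "s"}

def pos_flags(synset_types):
    """Build POS flags for a word from the synset types of all its senses."""
    types = set(synset_types)
    return {
        "is_noun": "n" in types,
        "is_verb": "v" in types,
        "is_adjective": bool(types & ADJECTIVE_TYPES),  # heads and satellites
        "is_adverb": "r" in types,
    }

# A word whose only adjective sense is a satellite still gets flagged,
# even when the satellite is not its primary POS.
print(pos_flags(["n", "s"])["is_adjective"])
```

Looking at all senses rather than just the primary one is what would cover the case the manual fix misses.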
- "paperwork" had 4 syllables
- Not sure if that's a pronunciation issue or a syllable-counting issue. For now I've removed "P AH IH P AH W AH R K" (uphone) from the list, but there's nothing stopping it from coming up again later. Thanks for the heads-up, Mark!
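If syllables are counted as vowel phonemes (an assumption; the real counter may work differently), the removed pronunciation accounts for the extra syllable by itself. A minimal sketch, using the standard ARPAbet vowel set and a made-up count_syllables helper; the second pronunciation below is only an illustrative three-vowel rendering of "paperwork", not taken from the Muse data:

```python
# ARPAbet vowel phonemes (stress digits are stripped before lookup).
VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}

def count_syllables(pronunciation):
    """Count vowel phonemes in a space-separated phoneme string."""
    return sum(
        1 for phone in pronunciation.split()
        if phone.rstrip("012") in VOWELS
    )

print(count_syllables("P AH IH P AH W AH R K"))  # the removed entry: AH, IH, AH, AH
print(count_syllables("P EY1 P ER0 W ER2 K"))    # illustrative alternative: EY, ER, ER
```

Under this counting rule the removed entry yields 4 and the alternative yields 3, which would point at the pronunciation rather than the counter.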
- "slum" supposedly rhymes with "m"
- M is said to be pronounceable like "um" (AH1 M), which is odd. Removed the entry. Also removed N's "AH1 N". I'm thinking our phoneme mapping is wrong-o.
- "shut", "rut", and "coconut" are shown as perfect rhymes for "that".
- "That" had a pronunciation of "DH AH1 T" ('thut', vs DH AE1 T), which doesn't serve us well here. I removed it from this instance of the data, but since I didn't fix it at the source, it'll likely creep back in on the next data build. Thanks for the heads-up, T.C.!
- Meaning of "routes" is moved to "rout". Why not "route"?
- Time to review the morphing code.
- "Group By Familiarity" is nearly useless
- The familiarity calculation needs work. I need to use a different approach, use a different corpus, or just remove it entirely. I don't think it's a very popular feature anyway.
- "Only Common Words, Yo" filters out many hyphenated words
- ...because of the way the corpus is split, most of those words don't show up. Probably need to interpolate my own value based on the known corpus words. That, or call it a feature.
- More phonemes match than we want, because of wildcarding
- Example: ??? Oh come on, don't tell me I actually did this properly. There has to be an example.
- Crashes when you search for a rhyme of a word that's not in our dictionary.
- Oops.
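A minimal sketch of the kind of guard that would avoid this, assuming the lookup is a plain dictionary access. The PRONUNCIATIONS table, find_rhymes, and the crude "drop the first phoneme" rhyme tail are all hypothetical stand-ins, not the real Muse code:

```python
# Hypothetical stand-in for the pronunciation dictionary.
PRONUNCIATIONS = {"home": ["HH OW1 M"], "roam": ["R OW1 M"]}

def tail(pronunciation):
    """Crude rhyme tail for this sketch: everything after the first phoneme."""
    return " ".join(pronunciation.split()[1:])

def find_rhymes(word):
    """Return (rhymes, message). Unknown words give a message, not an exception."""
    key = word.strip().lower()
    prons = PRONUNCIATIONS.get(key)
    if prons is None:
        return [], f'"{word}" is not in the dictionary.'
    tails = {tail(p) for p in prons}
    rhymes = sorted(
        other
        for other, other_prons in PRONUNCIATIONS.items()
        if other != key and any(tail(p) in tails for p in other_prons)
    )
    return rhymes, None

print(find_rhymes("home"))
print(find_rhymes("zzyzx"))
```

The point is only that a missing key comes back as an empty result plus a message the page can show.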
- Some words' part-of-speech (POS) values are unknown.
- For example, try searching for perfect rhymes of "paste", and make sure you Group by Word Type. Words like "defaced", "displaced", and "embraced" are classified as an Unknown POS, but you can click the words to view their WordNet entries! Something's wrong with my Java lemmatizer.
- "abdicated" rhymes with itself (problem: using D AH T EY K AH D B E% rather than D AH T EY K AH D B AE%)
- Allow slicing rhyme further, past default syllable boundary. Write general-case slicer.
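One possible shape for a general-case slicer, sketched under loud assumptions: phonemes are ARPAbet-style with stress digits on vowels, the default boundary is the final vowel, and "slicing further" means starting the rhyme portion more vowels back from the end. The slice_rhyme name, its syllables parameter, and the "coconut" pronunciation are illustrative only.

```python
def slice_rhyme(phones, syllables=1):
    """Return the tail of `phones` starting at the nth vowel from the end.

    syllables=1 is the assumed default boundary (final vowel onward).
    Larger values reach further back, clamped at the word's first vowel.
    """
    # Vowels are recognized by their trailing stress digit (0, 1, or 2).
    vowel_positions = [i for i, p in enumerate(phones) if p[-1].isdigit()]
    if not vowel_positions:
        return list(phones)  # nothing to anchor on: keep the whole word
    n = min(max(syllables, 1), len(vowel_positions))
    return list(phones[vowel_positions[-n]:])

coconut = "K OW1 K AH0 N AH2 T".split()  # illustrative pronunciation
print(slice_rhyme(coconut))     # default slice
print(slice_rhyme(coconut, 2))  # one syllable further back
print(slice_rhyme(coconut, 5))  # clamped to the first vowel
```

With a single parameter, the default rhyme and the deeper slices go through the same code path, which is the "general-case" part.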
Contact Me
Got something on your mind? And can you communicate it clearly? Awesome. In that case:
E-mail: plat@dillfrog.com
And don't forget about the announcement list of course.