Background:

I am currently trying to crack a salted MD5 hash which I have recovered from an embedded device running BusyBox and manufactured in China.

I have tried John the Ripper using all of the wordlists I could find in Kali.

I have tried CloudCracker and that returned no results.

I am currently brute-forcing on a machine with several GPUs using oclHashcat but haven't got anything back yet.

I have a serial console on the device which is in English. I have a low privileged account with the username jdoe and no password. I have handed this to someone who is much better at Linux privesc than I am.

I want to ensure that the device does not have a noddy password in Chinese, i.e. whatever the equivalent of "password", "hello", "letmein" etc. would be if you were an electronics factory in Shenzhen.

I am aware that there are multiple Chinese dialects but would be interested in wordlists from any of them as it would be worth a try anyway and they may be useful on similar devices.


TLDR: I have tried Google and had difficulty finding Chinese or "pinyin" wordlists suitable for dictionary attacks. I have searched this site and found some good sources for wordlists which I will probably try next but they are mainly in English and I have not seen any in Chinese.

Does anyone know of a good source for them?


Edit: Further searching has found me this http://www.backtrack-linux.org/forums/showthread.php?t=27764 which I will try. I am unsure if that will provide decent coverage for my purposes though.


Edit 2:

I have posted an ASCII seed symbol wordlist generated from the top answer below at http://pastebin.com/aHrnZF57

You should be able to use it to generate wordlists of your desired length with the "crunch" wordlist generator or your favourite alternative tool.

For example:

Save the raw pastebin data as seed.txt in your home directory then invoke:

crunch MIN MAX -q ~/seed.txt -o myOutputWordlist.txt

The same can be done by substituting my seed list with some of those available from the backtrack-linux forums thread linked above.
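If crunch is not to hand, the same idea can be sketched in a few lines of Python. This is only an illustration and does not reproduce crunch's exact behaviour: it assumes the seed file holds whitespace-separated syllables, and the names `load_seeds`, `combine` and `write_wordlist` are made up for this example. It joins every ordered combination of between `min_len` and `max_len` seeds, with repetition allowed.

```python
import itertools
from pathlib import Path


def load_seeds(path):
    """Read whitespace-separated seed symbols, dropping duplicates but keeping order."""
    words = Path(path).expanduser().read_text(encoding="utf-8").split()
    return list(dict.fromkeys(words))


def combine(seeds, min_len, max_len):
    """Yield every concatenation of min_len..max_len seeds (repetition allowed)."""
    for n in range(min_len, max_len + 1):
        for combo in itertools.product(seeds, repeat=n):
            yield "".join(combo)


def write_wordlist(seed_path, out_path, min_len, max_len):
    """Stream all candidates to out_path, one per line, and return how many were written."""
    seeds = load_seeds(seed_path)
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for candidate in combine(seeds, min_len, max_len):
            out.write(candidate + "\n")
            count += 1
    return count
```

Calling `write_wordlist("~/seed.txt", "myOutputWordlist.txt", 1, 2)` would write every one- and two-syllable candidate. The output grows as the number of seeds raised to the power of the length, so keep `max_len` small.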

Hope this helps anyone with this problem - Happy bruteforcing :)

  • There's only a small number of pinyin possibilities. If you really wanted to, you could just look up the initials and finals and combine them, and you'd have the full... what?... 400 possible combinations? Commented Sep 13, 2014 at 8:32

1 Answer

Chinese (Mandarin) is a phonologically very restrictive language with a limited number of possible syllables. Hence, you can make a table consisting of every possible syllable in the language, as shown here:

    | a ai au an aŋ e ə əi əu ən əŋ i ia iai iau ian iaŋ ie iə iəu iən iəŋ io iu iuan iuə iun iuəŋ aɻ o u ua uai uan uaŋ uə uəi uən uəŋ m n ŋ
    | a ai ao an ang ê e ei ou en eng yi ya yai yao yan yang yê ye you yin ying yo yu yuan yue yun yong er o wu wa wai wan wang wo wei wun weng m n ng
p   | ba bai bao ban bang bo bei ben beng bi biao bian biang bie biu bin bing bu
pʰ  | pa pai pao pan pang po pei pou pen peng pi piao pian pie piu pin ping pu
m   | ma mai mao man mang mo mei mou men meng mi miao mian mie miu min ming mu
f   | fa fan fang fo fei fou fen feng fu
t   | da dai dao dan dang de dei dou den deng di diao dian die diu ding du duan duo dui dun dong
tʰ  | ta tai tao tan tang te tou teng ti tiao tian tie tiu ting tu tuan tuo tui tun tong
n   | na nai nao nan nang ne nei nou nen neng ni niao nian niang nie niu nin ning nü nüe nu nuan nuo nong
l   | la lai lao lan lang le lei lou leng li lia liao lian liang lie liu lin ling lü lüe lo lu luan luo lun long
ts  | za zai zao zan zang ze zei zou zen zeng zi zu zuan zuo zui zun zong
tsʰ | ca cai cao can cang ce cou cen ceng ci cu cuan cuo cui cun cong
s   | sa sai sao san sang se sou sen seng si su suan suo sui sun song
tʂ  | zha zhai zhao zhan zhang zhe zhei zhou zhen zheng zhi zhu zhua zhuai zhuan zhuang zhuo zhui zhun zhong
tʂʰ | cha chai chao chan chang che chou chen cheng chi chu chua chuai chuan chuang chuo chui chun chong
ʂ   | sha shai shao shan shang she shei shou shen sheng shi shu shua shuai shuan shuang shuo shui shun
ɻ   | rao ran rang re rou ren reng ri ru rua ruan ruo rui run rong
tɕ  | ji jia jiao jian jiang jie jiu jin jing ju juan jue jun jiong
tɕʰ | qi qia qiao qian qiang qie qiu qin qing qu quan que qun qiong
ɕ   | xi xia xiao xian xiang xie xiu xin xing xu xuan xue xun xiong
k   | ga gai gao gan gang ge gei gou gen geng gu gua guai guan guang guo gui gun gong
kʰ  | ka kai kao kan kang ke kei kou ken keng ku kua kuai kuan kuang kuo kui kun kong
x   | ha hai hao han hang he hei hou hen heng hu hua huai huan huang huo hui hun hong

The first row contains the column headings and the first column contains the row headings. You can therefore ignore the first row and column, as they were there to help me learn Mandarin pronunciation.

You may want to substitute ü with v as that is how it is inputted on a keyboard since no pinyin syllable uses v. (It’s quite a bit more accessible than typing an accented character.)
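A minimal sketch of that substitution in Python, in case it helps when cleaning up a syllable list. The function name `to_keyboard_form` is made up for this example, and the Unicode normalisation step is my own precaution for text where ü is stored as u plus a combining diaeresis.

```python
import unicodedata


def to_keyboard_form(syllable):
    """Write ü as v, the key commonly used to type it in pinyin input methods."""
    # NFC first, so that "u" + combining diaeresis is treated the same as "ü".
    composed = unicodedata.normalize("NFC", syllable)
    return composed.replace("ü", "v").replace("Ü", "V")
```

With this, nü becomes nv and lüe becomes lve, while syllables without ü are left unchanged.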

Many syllables are not used, so their positions in the table are empty, giving a total of 410 possible syllables. Furthermore, there are 5 tones (numbered from 0) in Mandarin Chinese, denoted by numbers, so a password could be something like gao1mi4. Counting the tones, there are 410 × 5 = 2050 possible syllables. With Chinese words typically being 1 to 3 syllables long, it's a real possibility that you could break a Pinyin-based password in at most 2050 + 2050² + 2050³ = 8,619,329,550 attempts (of which 2050³ = 8,615,125,000 are three-syllable candidates) using random combinations of syllables and tones, without the aid of a word list.
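The arithmetic can be checked with a short Python sketch, which also shows one way to enumerate candidates such as gao1mi4. The function names are made up for this example, and the tone digits 0 to 4 are an assumption based on the "5 tones (beginning with 0)" description above. Note that 2050³ on its own is 8,615,125,000; adding the one- and two-syllable candidates gives the slightly larger total printed below.

```python
import itertools

SYLLABLES = 410  # syllable count quoted in the answer above
TONES = 5        # assumed to be written as the digits 0-4


def keyspace(symbols, max_syllables):
    """Total number of candidates that are 1..max_syllables symbols long."""
    return sum(symbols ** n for n in range(1, max_syllables + 1))


def toned_candidates(syllables, max_syllables, tones="01234"):
    """Yield candidates such as 'gao1mi4' built from syllable+tone units."""
    units = [s + t for s in syllables for t in tones]
    for n in range(1, max_syllables + 1):
        for combo in itertools.product(units, repeat=n):
            yield "".join(combo)


print(keyspace(SYLLABLES * TONES, 3))  # 8619329550 (with tone digits)
print(keyspace(SYLLABLES, 3))          # 69089510 (tones omitted)
```

Dropping the tone digits shrinks the search by roughly two orders of magnitude, so it is probably worth trying the toneless candidates first.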

  • Awesome - thanks. Will try and build a list from this and possibly use markov chaining to see if I can improve it. Will upvote when I have the rep and post my list as a comment to my question. Commented Apr 16, 2014 at 8:05
  • If I may add some more hints, typing Pinyin in an IME does not require the use of the SHIFT or CAPS LOCK keys. You might want to start with all lowercase. I can also see a Chinese user will probably omit the tone number altogether. The Chinese military was known to send messages in pinyin without the tones because they could be understood without the tones anyway. Commented Apr 18, 2014 at 21:18
  • I've added a textual table to replace the graphic one I took from Pinyin.info. This one can be copied into other software. If you want a challenge, there are 4 standardized vocabulary lists from the Taiwanese government. Basically, the words on that list are the ones people will most likely use. They are listed with the Pinyin, although they use tone marks and not tone numbers: sc-top.org.tw/download/800Words_Beginners.pdf, sc-top.org.tw/download/8000_Basic.pdf, sc-top.org.tw/download/8000_Intermediate.pdf, sc-top.org.tw/download/8000_Advanced.pdf Commented Apr 18, 2014 at 22:12
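Following up on the hints in the comments above (all lowercase, tones probably omitted, vocabulary lists that use tone marks), here is a rough Python sketch for turning tone-marked pinyin into plain lowercase candidates. The function name `strip_tone_marks` is made up for this example, and it assumes the input is already pinyin text extracted from the lists. It removes only the four tone diacritics, so the diaeresis on ü survives and is then written as v.

```python
import unicodedata

# Combining marks for the four written tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tone_marks(text):
    """Return lowercase pinyin with tone marks removed and ü written as v."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that u + diaeresis becomes ü again, then use the keyboard form.
    return unicodedata.normalize("NFC", kept).replace("ü", "v")
```

For instance, gāomì would come out as gaomi and nǚ as nv. Converting marks to tone numbers instead would need the text split into syllables first, which this sketch does not attempt.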
