from jieba import posseg
s = "你想去学校填写学生寒暑假住校申请表吗?"
words = posseg.cut(s, HMM=False)
print([word for word in words])

Running this raises the following error:

Building prefix dict from D:\Program Files\python\lib\site-packages\jieba\dict.txt ...
Loading model from cache C:\Users\cflpd\AppData\Local\Temp\jieba.cache
Loading model cost 0.8404476642608643 seconds.
Prefix dict has been built succesfully.
Traceback (most recent call last):
  File "C:\Users\cflpd\Desktop\桌面文件\test\nltktest.py", line 4, in <module>
    print([word for word in words])
TypeError: __repr__ returned non-string (type bytes)

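I ran into the error above today while using jieba.posseg. After some searching, I found that the newer posseg.cut() turns the string into a generator (of pair objects) rather than returning a list, so the approach above does not work as written, and neither does the following one: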

TypeError                                 Traceback (most recent call last)
<ipython-input-5-f105f6980f88> in <module>()
      1 import jieba.posseg as pseg
      2 seg_list = pseg.cut("我爱北京天安门")
----> 3 for word,flag in seg_list:
      4     print(word)
      5     print(flag)

TypeError: cannot unpack non-iterable pair object

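Both errors come from how the pair objects are handled: the first snippet prints them through repr(), the second tries to tuple-unpack them. One way around both is to read each pair's word and flag attributes directly.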
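
Here is a minimal sketch of one possible workaround. It assumes that the objects yielded by posseg.cut() expose word and flag attributes, which is the case in the jieba releases I am aware of. The helper names to_word_flag_tuples and tag_sentence are made up for illustration and are not part of jieba.

```python
def to_word_flag_tuples(pairs):
    """Convert an iterable of pair-like objects into (word, flag) tuples.

    Only the .word and .flag attributes are read, so the objects are never
    passed through repr() and never tuple-unpacked.
    """
    return [(p.word, p.flag) for p in pairs]


def tag_sentence(sentence):
    """Segment and POS-tag a sentence with jieba (hypothetical helper)."""
    # Imported lazily so that to_word_flag_tuples stays usable without jieba.
    from jieba import posseg

    # posseg.cut() returns a generator; the helper consumes it exactly once.
    return to_word_flag_tuples(posseg.cut(sentence))


# Example usage (requires jieba to be installed):
#     for word, flag in tag_sentence("我爱北京天安门"):
#         print(word, flag)
```

Since the result is a list of plain tuples, it can be printed, unpacked in a for loop, or iterated more than once, whereas the generator returned by posseg.cut() is exhausted after a single pass.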

 

 
