ImgtoPoem

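Preface

Ideally, given an image, the tool would directly return a line of classical verse (shi, ci, or qu), preferring retrieval and falling back to AI generation only when retrieval finds nothing; that is, image $\Rightarrow$ verse.

In practice, the current implementation is image $\Rightarrow$ English $\Rightarrow$ Chinese (modern vernacular) $\Rightarrow$ verse. Two gaps can arise along the way: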

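- Image to Chinese: the generated text may not be rich enough. To address this, a text box is provided for supplying extra information.
- Vernacular to verse: there are currently two approaches.
  - Use only verses that have translations (file ct_data.jsonl), stored in the database in a form similar to Hypothetical Questions. The drawback is that, given the data format, the number of verses with an explicitly matching translation is limited.
  - Use all verses (file c_data.jsonl). This in turn has two variants:
    - Fine-tune the embedding model on {query: translation, pos: [verse text], neg: [negative samples (verses)]}, then use the vernacular text as the query to retrieve verses.
    - Fine-tune the embedding model on {query: translation + LLM-generated verse, pos: [verse text], neg: [negative samples (verses)]}. At query time, first have the LLM generate an answer, then concatenate the query and the answer and search the vector database for verses. Alternatively, train on {query: LLM-generated verse, pos: [verse text], neg: [negative samples (verses)]}.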
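As a rough illustration of the {query, pos, neg} training records described in the list above, the sketch below builds them from verse/translation pairs. The input keys `poem`, `translation`, and `llm_draft`, the function name `build_triplets`, and the use of random negative sampling are all assumptions made for this example; the README does not specify the schema of ct_data.jsonl or how negatives are actually chosen.

```python
import json
import random


def build_triplets(records, num_neg=2, seed=0, use_llm_draft=False):
    """Build {query, pos, neg} records for embedding fine-tuning.

    `records` is a list of dicts with the hypothetical keys 'poem',
    'translation', and (optionally) 'llm_draft'.
    """
    rng = random.Random(seed)
    poems = [r["poem"] for r in records]
    triplets = []
    for r in records:
        # Negative candidates: every verse except the positive one.
        candidates = [p for p in poems if p != r["poem"]]
        query = r["translation"]
        if use_llm_draft:
            # Second variant: append the LLM-drafted verse to the translation.
            query = query + r.get("llm_draft", "")
        triplets.append({
            "query": query,
            "pos": [r["poem"]],
            # Random negatives are an assumption; hard-negative mining is another option.
            "neg": rng.sample(candidates, min(num_neg, len(candidates))),
        })
    return triplets


if __name__ == "__main__":
    demo = [
        {"poem": "床前明月光，疑是地上霜。", "translation": "明亮的月光洒在床前，好像地上结了一层霜。"},
        {"poem": "白日依山尽，黄河入海流。", "translation": "太阳靠着山慢慢落下，黄河向着大海奔流。"},
        {"poem": "春眠不觉晓，处处闻啼鸟。", "translation": "春天睡得很沉，不知不觉天亮了，到处都能听到鸟叫。"},
    ]
    for t in build_triplets(demo):
        # One JSON object per line, matching the .jsonl convention used in this repo.
        print(json.dumps(t, ensure_ascii=False))
```

Setting `use_llm_draft=True` corresponds to the "translation + LLM-generated verse" variant, provided each record carries a drafted verse.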

The current focus is on closing the vernacular-to-verse gap, but since "image to poem" was the original idea, the image upload feature has been kept.
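The query-time flow described above (caption, translation, optional extra text, optional LLM draft, retrieval first, generation as fallback) can be sketched as follows. This is a minimal sketch, not the code in run.py: `find_poem` and all of its callable parameters are hypothetical stand-ins for the image-to-text model, the translator, the vector search, and the LLM.

```python
from typing import Callable, List, Optional


def find_poem(
    image_caption_en: Optional[str],
    extra_text: str,
    translate: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    draft: Optional[Callable[[str], str]] = None,
) -> str:
    """Return a verse for an English image caption and/or user-supplied text."""
    parts = []
    if image_caption_en:
        # image => English => Chinese (modern vernacular)
        parts.append(translate(image_caption_en))
    if extra_text:
        # Text-box input compensates for a sparse caption.
        parts.append(extra_text)
    if not parts:
        raise ValueError("need an image caption or some text")
    query = " ".join(parts)
    if draft is not None:
        # Optional: concatenate an LLM-drafted verse to the query before searching.
        query = query + draft(query)
    hits = retrieve(query)
    # Retrieval first; fall back to generation only when nothing is found.
    return hits[0] if hits else generate(query)


if __name__ == "__main__":
    verse = find_poem(
        "snowy mountains under a clear sky",
        "",
        translate=lambda s: "晴空下的雪山",          # stub translator
        retrieve=lambda q: ["窗含西岭千秋雪"] if "雪" in q else [],  # stub search
        generate=lambda q: "(generated verse)",      # stub LLM
    )
    print(verse)
```

Passing the components as callables keeps the sketch independent of any particular model or vector store.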

Running

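Set the local paths in LOCALPATH.py, mainly ENV_PATH, RERANK_PATH, and EMBEDDING_PATH
If you do not want to use the image-to-text model, you can enter the relevant information directly in the text box
Create a virtual environment: conda create -n imgtopoem python=3.10

Install dependencies: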

conda activate imgtopoem
pip install -r requirements.txt

python run.py
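For the path settings mentioned in the first step, LOCALPATH.py might look roughly like the sketch below. Only the three variable names come from this README; the values are placeholders to replace, and the comments about what each path points to are guesses rather than documented behavior.

```python
# LOCALPATH.py (sketch): replace every placeholder with a real local path.

# Assumption: location of the environment/config used by run.py.
ENV_PATH = "/path/to/env"

# Assumption: local directory of the rerank model.
RERANK_PATH = "/path/to/rerank-model"

# Assumption: local directory of the embedding model.
EMBEDDING_PATH = "/path/to/embedding-model"
```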

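Structure

Note: [] marks content that is not in the repo but can be obtained from the listed source or generated by the Python files (in this directory)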

data

data
├── [works.json]: data from https://github.com/VMIJUNV/chinese-poetry-and-prose
├── [c_data.jsonl]: generated from works.json by prepare_data.ipynb
├── [ct_data.jsonl]: generated from works.json by prepare_data.ipynb
├── prepare_c_db.ipynb
├── prepare_ct_db.ipynb
├── prepare_data.ipynb
├── [select_c_data.jsonl]: generated from works.json by prepare_data.ipynb; a substitute for c_data.jsonl
├── vectordb_ct
│   └── faiss
│       ├── index.faiss
│       └── index.pkl
└── vectordb_select_c
    └── faiss
        ├── index.faiss
        └── index.pkl
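Before launching the app, it can help to confirm that each vector database folder contains the two files shown in the tree above. The helper below is a small sketch using only the standard library; the function name `missing_index_files` is made up for this example. The index.faiss / index.pkl pair matches what LangChain's FAISS wrapper writes when saving an index locally, although this README does not state which library produced these files.

```python
from pathlib import Path


def missing_index_files(db_dir):
    """Return the expected index files that are absent from <db_dir>/faiss."""
    faiss_dir = Path(db_dir) / "faiss"
    expected = ["index.faiss", "index.pkl"]
    return [name for name in expected if not (faiss_dir / name).is_file()]


if __name__ == "__main__":
    # Folder names taken from the tree above; run from the repository root.
    for db in ("data/vectordb_ct", "data/vectordb_select_c"):
        missing = missing_index_files(db)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(db + ": " + status)
```

If files are reported missing, the prepare_ct_db.ipynb and prepare_c_db.ipynb notebooks are the likely place to regenerate them.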

tools

tools
├── imgtotext.py: image-to-text
└── response.py

Results

Original images from: https://unsplash.com/t/nature

https://unsplash.com/photos/a-snow-covered-mountain-range-with-a-clear-sky-Je7XqcBmDFg?utm_content=creditShareLink&utm_medium=referral&utm_source=unsplash

Photo by <a href="https://unsplash.com/@eugene_golovesov?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Eugene Golovesov</a> on <a href="https://unsplash.com/photos/a-group-of-trees-that-are-in-the-snow-z994gPo74ck?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Unsplash</a>

https://unsplash.com/photos/a-snow-covered-mountain-range-with-a-clear-sky-Je7XqcBmDFg?utm_content=creditShareLink&utm_medium=referral&utm_source=unsplash

Photo by <a href="https://unsplash.com/@marekpiwnicki?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Marek Piwnicki</a> on <a href="https://unsplash.com/photos/a-snow-covered-mountain-range-with-a-clear-sky-Je7XqcBmDFg?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Unsplash</a>

Other

Fine-tuning the embedding model

Fine-tuning the LLM

TODO

Fine-tune the reranker
Try Chinese-CLIP


Repository: time1527/img-to-poem
