Large language models (LLMs) are proficient at generating natural language, following user instructions, and emulating human-like language use, which has led to significant interest in applying them to role-playing scenarios. However, manually collecting role-specific script data and evaluating model performance are resource-intensive. This project introduces a prompt-based framework that uses GPT both to generate role-playing dialogue datasets and to evaluate role-playing performance. To validate the GPT-based generation and evaluation, we additionally report the recall-oriented ROUGE-L metric as a quantitative measure of performance.
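
For readers unfamiliar with the metric, the recall form of ROUGE-L can be sketched as the length of the longest common subsequence (LCS) between a reference and a candidate, divided by the reference length. The snippet below is only an illustration and not the implementation used in this project: the function names `lcs_length` and `rouge_l_recall` are hypothetical, and lowercasing plus whitespace tokenization are simplifying assumptions.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    # Rolling one-row dynamic program: prev[j] holds the LCS length of the
    # tokens of `a` processed so far against the first j tokens of `b`.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(reference: str, candidate: str) -> float:
    """Recall-oriented ROUGE-L: LCS length divided by reference length.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens:
        return 0.0  # no reference tokens to recall
    return lcs_length(ref_tokens, cand_tokens) / len(ref_tokens)


if __name__ == "__main__":
    score = rouge_l_recall("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-L recall: {score:.3f}")
```

A score near 1 means that almost all reference tokens appear, in order, in the generated text. Because only recall is measured here, a long candidate is not penalized for extra tokens.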