Conversation Feature Using ChatGPT and STT

🗣️ Project Overview

TalkFlow-KR : a conversation-practice service built with ChatGPT

⏰ Development Period

2023.05.02 ~ 2023.05.15

⚙️ Development Environment

Front-End : React, OpenAI (text-davinci-003 model), STT (Speech-to-Text, via the React-Speech-Kit API)
Back-End : Node.js
Server : MySQL

✍ TalkFlow Features

- Processes voice data with an STT API
- Uses an OpenAI GPT-3 model

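✍ Code Walkthrough

Database Design

The initial schema was designed with AQueryTool, and the MySQL DB was then connected through a VS Code extension.
The connection settings are read from the configuration file ('config/config.json'), and a sequelize object is used to connect to the database.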
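
As a minimal sketch of that connection step, assume config/config.json follows the layout generated by sequelize-cli (one section per environment with username, password, database, host, and dialect keys). The helper below picks the section for an environment and turns it into the arguments for the Sequelize constructor. The helper name `buildSequelizeArgs` and the sample values are hypothetical and not taken from the project.

```javascript
// Hypothetical sample of what config/config.json might contain
// (layout assumed to match the sequelize-cli default).
const sampleConfig = {
  development: {
    username: "user",
    password: "secret",
    database: "talkflow",
    host: "127.0.0.1",
    dialect: "mysql",
  },
};

// Select the section for the given environment and split it into the
// positional arguments of `new Sequelize(database, username, password, options)`.
function buildSequelizeArgs(config, env = "development") {
  const section = config[env];
  if (!section) {
    throw new Error(`No database settings for environment "${env}"`);
  }
  const { database, username, password, ...options } = section;
  return [database, username, password, options];
}

const args = buildSequelizeArgs(sampleConfig, "development");
console.log(args[0], args[3]);
```

In the real application the result would be passed on as `new Sequelize(...args)`, with `Sequelize` imported from the sequelize package and the config loaded with require (the exact path is an assumption).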

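Using Sequelize

Sequelize is used to define the database models and connect them to the database.
The model for each table is loaded with require from the file that defines it, and the relationships between the models and the database are then set up.
sequelize.sync() method
- Synchronizes the Sequelize models with the database
- {force: false} : if a table already exists, it is kept as is rather than dropped and recreated
Per-table settings
- freezeTableName: true
  - Keeps the table name fixed; Sequelize's automatic pluralization is not applied
- timestamps: false
  - Does not record when rows are created or updated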
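
The settings above could look like this in a model file. This is a minimal sketch: the column names are taken from the handlers in this post, but the column types and the function names `defineMsg` and `syncDatabase` are assumptions.

```javascript
// Model definition in the (sequelize, DataTypes) => Model style.
// Column names come from the handlers in this post; the types are assumed.
function defineMsg(sequelize, DataTypes) {
  return sequelize.define(
    "MSG",
    {
      msg_id: { type: DataTypes.INTEGER, primaryKey: true, autoIncrement: true },
      room_id: { type: DataTypes.INTEGER, allowNull: false },
      user_id: { type: DataTypes.INTEGER, allowNull: false },
      part_id: { type: DataTypes.STRING, allowNull: false }, // "system" | "user" | "assistant"
      content: { type: DataTypes.TEXT, allowNull: false },
    },
    {
      freezeTableName: true, // the table is named exactly "MSG", not pluralized
      timestamps: false, // no createdAt / updatedAt columns
    }
  );
}

// Create any missing tables at startup. With force: false, existing tables
// are left untouched rather than dropped and recreated.
async function syncDatabase(sequelize) {
  await sequelize.sync({ force: false });
}
```

`defineMsg` would be called with a real Sequelize instance and the `DataTypes` export of the sequelize package, and `syncDatabase` once before the server starts listening.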

Message Handling and Conversation with the GPT-3.5 Model

exports.msg = async (req, res) => {
  const userid = req.params.userid;
  const roomid = req.params.roomid;

  const result = await models.MSG.findAll({
    raw: true,
    where: {
      room_id: roomid,
      user_id: userid,
    },
  });

  let newMsg = [];
  for (let i = 0; i < result.length; i++) {
    const { msg_id, room_id, user_id, ...others } = result[i];
    newMsg.push(others);
  }
  res.send(newMsg);
};

msg
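- Looks up the matching messages in models.MSG based on userid and roomid
- models.MSG.findAll
  - Searches the MSG table for rows with a matching roomid and userid
  - raw: true
    - The returned data are plain JSON objects rather than Sequelize model instances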
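
To make the shape of the response concrete, here is a small sketch of the filtering step on hand-written sample rows. The rows and the helper name `toClientMessages` are hypothetical; only the column names come from the handler above.

```javascript
// Sample rows shaped like the output of findAll({ raw: true }): plain objects.
// The values are made up for illustration.
const rows = [
  { msg_id: 1, room_id: 3, user_id: 7, part_id: "system", content: "Let's play a role play." },
  { msg_id: 2, room_id: 3, user_id: 7, part_id: "user", content: "Hello!" },
];

// Drop the identifier columns with rest destructuring and keep the remaining fields.
function toClientMessages(resultRows) {
  return resultRows.map(({ msg_id, room_id, user_id, ...others }) => others);
}

console.log(toClientMessages(rows));
```

Each object sent to the client then carries only part_id and content.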

exports.runGPT35 = async (req, res) => {
  // Fetch the stored messages for this user and room
  const userid = req.params.userid;
  const roomid = req.params.roomid;

  const result = await models.MSG.findAll({
    raw: true,
    where: {
      room_id: roomid,
      user_id: userid,
    },
  });
  console.log("res: ", result); // [ {}, {}, {}, ... ]

  if (result.length > 0) {
    // The MSG table already has messages for this room
    let newMsg = []; // start with an empty array
    // Iterate over each row of result
    for (let i = 0; i < result.length; i++) {
      newMsg.push({ role: result[i].part_id, content: result[i].content });
    }

    // Call the API with the past history plus the new user message
    const response = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [...newMsg, { role: "user", content: req.body.msg }],
    });

    await models.MSG.create({
      part_id: "user",
      content: req.body.msg,
      room_id: roomid,
      user_id: userid,
    });
    await models.MSG.create({
      part_id: response.data.choices[0].message.role,
      content: response.data.choices[0].message.content,
      room_id: roomid,
      user_id: userid,
    });
    res.send(response.data.choices[0].message.content); // return the answer
  } else {
    // If there are no stored messages yet, configure GPT with the settings saved in ROOM
    const settings = await models.ROOM.findOne({
      raw: true,
      where: {
        room_id: roomid,
      },
    });
    console.log("set :", settings); // room settings
    const situation = settings.situation;
    const accent = settings.accent;
    const language = settings.language;
    // Build msg (the system message) from situation, accent, and language
    const msg = `Let's play a role play. you can play any role in ${situation}.
                   but you must use ${language} and please speak with ${accent} accent.`;

    const response = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [
        { role: "system", content: msg },
        { role: "user", content: req.body.msg },
      ],
    });

    // Save the messages to the DB
    await models.MSG.create({
      part_id: "system",
      content: msg,
      room_id: roomid,
      user_id: userid,
    });

    await models.MSG.create({
      part_id: "user",
      content: req.body.msg,
      room_id: roomid,
      user_id: userid,
    });

    await models.MSG.create({
      part_id: response.data.choices[0].message.role,
      content: response.data.choices[0].message.content,
      room_id: roomid,
      user_id: userid,
    });
    res.send(response.data.choices[0].message.content); // return the answer
  }
};

runGPT35
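- Looks up past messages in models.MSG
- Calls the OpenAI API with the retrieved history to continue the conversation
- Saves the new user message and the API response to models.MSG, then returns the answer to the client
- If there are no past messages, reads the conversation settings from models.ROOM, builds a system message from them, and saves it along with the first exchange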
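
The two branches differ only in how the messages array is assembled, which can be sketched as one pure function. This is an illustration under assumptions, not the project's code: the helper names `buildSystemPrompt` and `buildMessages` and the sample settings are hypothetical, and the prompt wording is lightly tidied from the handler above.

```javascript
// Build the system prompt from the room settings.
function buildSystemPrompt({ situation, language, accent }) {
  return (
    `Let's play a role play. You can play any role in ${situation}, ` +
    `but you must use ${language} and please speak with a ${accent} accent.`
  );
}

// Turn stored rows into the `messages` array for the chat completion request.
// With no history, the conversation starts from the system prompt instead.
function buildMessages(historyRows, roomSettings, userMsg) {
  const past =
    historyRows.length > 0
      ? historyRows.map((row) => ({ role: row.part_id, content: row.content }))
      : [{ role: "system", content: buildSystemPrompt(roomSettings) }];
  return [...past, { role: "user", content: userMsg }];
}

// Hypothetical room settings and user input for a brand-new room.
const first = buildMessages(
  [],
  { situation: "a cafe", language: "English", accent: "British" },
  "Hi"
);
console.log(first);
```

With a helper like this, the handler would need a single API call, and the only remaining difference between the branches is whether the system message also has to be stored.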

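📝 What I Learned

Using the GPT-3.5 Model

GPT-3.5 let us complete the core feature for conversation practice. Drawing on its training data, the model handled text completion, translation, and question answering at a high level. By building a system message (msg) from the room's situation, accent, and language settings and sending it to GPT, we could tailor the conversation to each user without any additional code.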

React-Speech-Kit

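To add speech recognition, we used the Web Speech API. It was easy to integrate through React-Speech-Kit, which wraps the API for React. We had to distinguish three states: before the user speaks, while speech is being recognized, and after the user has finished speaking. This part was developed mainly by the front-end side, but I took an interest in the code because it was a chance to work with a new API.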
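
The three states could be modeled with a small pure function, sketched below. It assumes the component has a boolean `listening` flag (such as the one returned by React-Speech-Kit's `useSpeechRecognition` hook) and the transcript collected so far. The function name `speechPhase` and the phase labels are hypothetical, not the project's code.

```javascript
// Derive the UI phase from the recognition state.
//   "idle"      : nothing has been said yet
//   "listening" : speech is currently being recognized
//   "finished"  : recognition has stopped and a transcript is available
function speechPhase(listening, transcript) {
  if (listening) return "listening";
  return transcript.trim().length > 0 ? "finished" : "idle";
}

console.log(speechPhase(false, ""));
console.log(speechPhase(true, "hel"));
console.log(speechPhase(false, "hello"));
```

A component could switch its button label or send the transcript to the server based on the returned phase.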

❗ Features to Add

- Server deployment
- To be updated

✏️ Project Record

https://github.com/TalkFlow-KR/TalkFlow-KR
