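I occasionally get questions about AWS Rekognition, AWS's deep-learning-based image and video analysis service, so I wrote some code against it and put together a simple test to see how it could be used. I originally planned to run a quick test without writing a post, but the service turned out to be interesting and easy to use, so here is a short introduction.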

1. What is AWS Rekognition?

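Before describing the test setup, here is a brief introduction to Rekognition. Rekognition is a service for analyzing images and videos using deep learning, and it is provided as an API. Of its main analysis features, this test uses two: label detection, which recognizes the elements in an image and returns keywords for them, and content moderation, which finds elements in an image that are close to inappropriate labels.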

2. Using the Rekognition API from a Python script (boto3): finding keywords for an image

Rekognition supports several SDKs, including Node.js, Java, PHP, and Python. This test uses boto3, the Python SDK. (List of supported SDKs - https://aws.amazon.com/ko/rekognition/resources/?nc=sn&loc=6 )

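First, we need test images for Rekognition to analyze. An image can be loaded from an S3 bucket or read from a local file as a byte string, but an image passed as a byte string is limited to 5 MB. This test uses images stored in an S3 bucket. The sample images were downloaded arbitrarily from shopping sites and similar sources and are shown below.
(The attached images below have been blurred, but the actual tests were run on the clear, unblurred originals.)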
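As a rough sketch of the byte-string route mentioned above, the helper below builds the Image argument from raw bytes and rejects anything over the limit before the API is called. The names build_image_param and MAX_IMAGE_BYTES are illustrative assumptions (they are not part of boto3), and the sketch assumes that 5 MB means 5 x 1024 x 1024 bytes.

```python
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumed byte value of the 5 MB raw-bytes limit


def build_image_param(data: bytes) -> dict:
    """Return the Image argument for a Rekognition call from raw image bytes.

    Hypothetical helper: raises ValueError when the payload exceeds the limit,
    so an oversized file fails locally instead of inside the API call.
    """
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(
            f"image is {len(data)} bytes; the limit is {MAX_IMAGE_BYTES} bytes"
        )
    return {"Bytes": data}
```

With a local file, the result could be passed as client.detect_labels(Image=build_image_param(f.read())) after opening the file in 'rb' mode. Images above the limit would need to go through S3 instead.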

Now that the images are ready, we can write the Python script. Before writing it, install the boto3 module.

pip install boto3

The Python script is as follows.

# Analyze an image stored in an S3 bucket and print its labels
import boto3


def detect_labels(photo, bucket):
    client = boto3.client('rekognition')

    response = client.detect_labels(Image={'S3Object': {'Bucket': bucket, 'Name': photo}})

    print('Detected labels in ' + photo)
    for label in response['Labels']:
        print(label['Name'] + ' : ' + str(label['Confidence']))

    return len(response['Labels'])


def main():
    photo = 'upload-img/child.jpg'  # Key (path + file name) of the image to analyze; change to match your image.
    bucket = 'premisan-test'  # Name of the S3 bucket that holds the image; change as appropriate.

    label_count = detect_labels(photo, bucket)
    print("Labels detected: " + str(label_count))


if __name__ == "__main__":
    main()

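Now run the script from the command line to analyze each of the images uploaded to the S3 bucket in turn.

1) Boy image (child.jpg)

The output includes keywords such as Boy and Clothing/Apparel. The number next to each keyword is the Confidence value, which indicates how confident the service is in that keyword.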
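Since each label carries a Confidence value, one common follow-up is to keep only the labels above a chosen threshold. The sketch below does that on the client side; filter_labels and the 90.0 default are assumptions for illustration, and the sample response is made up rather than taken from real API output.

```python
def filter_labels(response: dict, min_confidence: float = 90.0) -> list:
    """Return (name, confidence) pairs at or above min_confidence, highest first."""
    kept = [
        (label["Name"], label["Confidence"])
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up response fragment for illustration only (not real API output).
sample = {
    "Labels": [
        {"Name": "Clothing", "Confidence": 95.4},
        {"Name": "Boy", "Confidence": 98.1},
        {"Name": "Shoe", "Confidence": 62.0},
    ]
}

for name, confidence in filter_labels(sample):
    print(name, confidence)
```

detect_labels also accepts MinConfidence and MaxLabels parameters, so the same filtering can be requested from the service itself instead of being done after the fact.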

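2) Dog image (dog.jpg)

Surprisingly, it even recognizes the breed (Golden Retriever). The image used here is in fact one of the results from a Google search for golden retrievers.

3) Image of a woman wearing a dress (onepiece.jpg)

Because the whole face is not visible in the image, keywords such as Mobile Phone and Cell Phone scored higher confidence than the Human keyword. Furniture was recognized as well.

4) Sea photo (sea.jpg)

This test used a natural landscape image. Objects in the photo such as Sky, Sea, and Cloud were recognized with high confidence.

5) Swimsuit photo (swim.jpg)

This is the swimsuit photo that will be used for the inappropriate-content detection below. The clothing the person is wearing was recognized, with keywords such as Bikini and Swimwear.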

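3. Using the Rekognition API from a Python script (boto3): finding images with inappropriate content

The script is much the same as before. The main difference is that the detect_labels call used to find image keywords is replaced with detect_moderation_labels, which finds inappropriate keywords.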

# Analyze an image stored in an S3 bucket and detect inappropriate content
import boto3


def moderate_image(photo, bucket):
    client = boto3.client('rekognition')

    response = client.detect_moderation_labels(Image={'S3Object': {'Bucket': bucket, 'Name': photo}})

    print('Detected labels for ' + photo)
    for label in response['ModerationLabels']:
        print(label['Name'] + ' : ' + str(label['Confidence']))
        print(label['ParentName'])

    return len(response['ModerationLabels'])


def main():
    photo = 'upload-img/child.jpg'  # Key (path + file name) of the image to analyze; change to match your image.
    bucket = 'premisan-test'  # Name of the S3 bucket that holds the image; change as appropriate.
    label_count = moderate_image(photo, bucket)
    print("Labels detected: " + str(label_count))


if __name__ == "__main__":
    main()

Running the script with the file name changed to each of the five sample images in turn gives the results below.
The other four images were not flagged as inappropriate, but the image of the woman in a swimsuit was flagged as inappropriate content because of the Revealing Clothes and Female Swimwear Or Underwear keywords.

1) Boy image (child.jpg)

2) Dog image (dog.jpg)

3) Image of a woman wearing a dress (onepiece.jpg)

4) Sea photo (sea.jpg)

5) Swimsuit photo (swim.jpg)
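The moderation script above prints each label together with its ParentName. As a small sketch of how that output might be condensed, the function below collects the parent categories of the labels that pass a confidence threshold. parent_categories and the 80.0 default are assumptions, and the sample response is made up for illustration (the label names come from the results described above, but the structure and values are not real API output).

```python
def parent_categories(response: dict, min_confidence: float = 80.0) -> list:
    """Return the sorted parent categories of labels at or above the threshold."""
    categories = set()
    for label in response.get("ModerationLabels", []):
        if label["Confidence"] < min_confidence:
            continue
        # Labels without a parent have an empty ParentName; fall back to the
        # label's own name in that case.
        categories.add(label.get("ParentName") or label["Name"])
    return sorted(categories)


# Made-up response fragment for illustration only (not real API output).
sample = {
    "ModerationLabels": [
        {"Name": "Suggestive", "ParentName": "", "Confidence": 93.0},
        {"Name": "Female Swimwear Or Underwear", "ParentName": "Suggestive", "Confidence": 93.0},
        {"Name": "Revealing Clothes", "ParentName": "Suggestive", "Confidence": 55.0},
    ]
}

print(parent_categories(sample))
```

detect_moderation_labels also takes a MinConfidence parameter, so a stricter or looser threshold can be applied on the service side as well. Which threshold is appropriate depends on how conservative the application needs to be.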

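4. Using the Rekognition API from a Flask-based web page

Now that we have a basic handle on the Rekognition API, let's build a simple file-upload HTML page on the Flask framework, check each image uploaded through that page with the Rekognition API, and upload only the images that are not flagged as inappropriate to an S3 bucket.

First, install the flask module. The test environment is CentOS 7 with Python 3.6 installed via yum.

python -m pip install flask

Create a directory to hold files temporarily before they are uploaded to the S3 bucket. You can also skip creating a separate upload directory and store the files temporarily under /tmp/ instead. (This is defined in the source code.)

mkdir ./upload && chmod 777 ./upload # Set the permissions appropriately for your user/environment.

Now write the source. The files to create are app.py and templates/index.html.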

# app.py source

import logging
import os
from flask import Flask, request, render_template
from werkzeug.utils import secure_filename
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
bucket_name = 'premisan-test'  # Name of the S3 bucket that receives images that are not flagged; change as appropriate.
app = Flask(__name__)


def moderate_image_local_file(filename, filedata):
    client = boto3.client('rekognition')

    response = client.detect_moderation_labels(Image={'Bytes': filedata})

    print('Detected labels for ' + filename)
    for label in response['ModerationLabels']:
        print(label['Name'] + ' : ' + str(label['Confidence']))
        print(label['ParentName'])

    return len(response['ModerationLabels'])


def upload_file(file_name, bucket_name, object_name):
    try:
        s3.upload_file(file_name, bucket_name, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True


@app.route('/', methods=['GET'])
def index():
    return render_template('index.html')


@app.route('/upload', methods=['POST'])
def upload():
    file = request.files['file']

    filename = file.filename
    filedata = file.read()

    label_count = moderate_image_local_file(filename, filedata)
    print("Labels detected: " + str(label_count))

    if label_count == 0:
        path = os.path.dirname(os.path.realpath(__file__)) + '/upload'
        filename = secure_filename(file.filename)

        # file.read() above consumed the stream, so rewind before saving.
        file.seek(0)
        file.save(os.path.join(path, filename))

        key = 'upload/' + filename  # Key (path + file name) the image is uploaded to in S3; change 'upload' to the S3 folder path you want.
        upload_file(os.path.join(path, filename), bucket_name, key)

        message = 'The file "' + filename + '" was uploaded successfully'
    else:
        message = 'The file "' + filename + '" is an inappropriate image'

    return message


if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5000, debug=True)  # The server listens on TCP/5000; change as appropriate for your environment.

# templates/index.html source

<html>
<body>

<form action="/upload" method="post" enctype="multipart/form-data">
<label for="file">Filename:</label>
<input type="file" name="file" id="file">
<input type="submit" name="submit" value="Submit">
</form>

</body>
</html>
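The upload handler in app.py saves the file to a temporary directory before sending it to S3. Because the image bytes are already in memory for the Rekognition call, one possible alternative is to send those same bytes to S3 directly with put_object. The sketch below is an assumption about how that could look, not the approach used in this post; moderate_and_store is a hypothetical name, and the two clients are passed in as arguments so the function can be exercised without AWS access.

```python
def moderate_and_store(rekognition, s3, bucket: str, key: str, data: bytes) -> bool:
    """Store data at s3://bucket/key only if no moderation labels are detected.

    rekognition and s3 are expected to be boto3 clients (or objects exposing
    the same two methods). Returns True if the object was stored and False
    if the image was flagged.
    """
    response = rekognition.detect_moderation_labels(Image={"Bytes": data})
    if response["ModerationLabels"]:
        return False
    # put_object sends the in-memory bytes, so no local temporary file is needed.
    s3.put_object(Bucket=bucket, Key=key, Body=data)
    return True
```

Inside the upload route this might be called as moderate_and_store(boto3.client('rekognition'), s3, bucket_name, 'upload/' + secure_filename(file.filename), filedata). Skipping the temporary file also removes the need for the ./upload directory and its permissions.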

The source is now complete, so let's start the Flask server.