Attributable Watermarking of Speech Generative Models

168441-Thumbnail Image.png
Description
Generative models in various domain such as images, speeches, and videos are beingdeveloped actively over the last decades and recent deep generative models are now capable of synthesizing multimedia contents are difficult to be distinguishable from authentic contents. Such capabilities cause concerns

Generative models in various domain such as images, speeches, and videos are beingdeveloped actively over the last decades and recent deep generative models are now capable of synthesizing multimedia contents are difficult to be distinguishable from authentic contents. Such capabilities cause concerns such as malicious impersonation, Intellectual property theft(IP theft) and copyright infringement. One method to solve these threats is to embedded attributable watermarking in synthesized contents so that user can identify the user-end models where the contents are generated from. This paper investigates a solution for model attribution, i.e., the classification of synthetic contents by their source models via watermarks embedded in the contents. Existing studies showed the feasibility of model attribution in the image domain and tradeoff between attribution accuracy and generation quality under the various adversarial attacks but not in speech domain. This work discuss the feasibility of model attribution in different domain and algorithmic improvements for generating user-end speech models that empirically achieve high accuracy of attribution while maintaining high generation quality. Lastly, several experiments are conducted show the tradeoff between attributability and generation quality under a variety of attacks on generated speech signals attempting to remove the watermarks.
Date Created
2021
Agent