Midjourney

While Midjourney aims to provide a creative tool for users, the service poses risks that span privacy, security, bias, and misuse. Mitigations are provided through moderation, subscription options, and prudent usage practices, though significant risks remain, especially for sensitive use cases. Overall, Midjourney should be used carefully and thoughtfully. For personal use, especially casual or recreational use, Midjourney can be an entertaining and inspiring creative aid.

Product Description

Midjourney's v5.2 model is a text-to-image model that generates images from user prompts and is based on the latent diffusion paradigm. The model is accessed through the company's Discord server, where users interact with it by submitting image-generation prompts to a Discord bot. Depending on the user's subscription, the user can interact with the bot in either publicly viewable Discord channels or in private direct messages submitted to the bot.

Midjourney, the company, "is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species." The company was founded in 2021 [1] and first published a model in early 2022. According to the company's website, it is a self-funded venture.

Profile last updated: July 13, 2023

Intended Use Case

Midjourney is designed to be used to create or modify images -- art, marketing content, realizations of "what if" scenarios, and more -- from user prompts containing text and, in specific contexts, images. The service can also be used to generate textual descriptions of user-supplied images. Midjourney caters to a creative customer/user base rather than a specific industry.

Risk and Mitigation Summary

The following table provides a quick summary of which common genAI-driven risks are posed through use of Midjourney and which risks have been addressed (but not necessarily eliminated) by deliberate mitigative measures provided with the product.

For a definition of each risk and details of how these risks arise in the context of Midjourney, see below. These risk details are non-exhaustive.

Risk                              Present   Built-in Mitigation
Abuse & Misuse                    ⚠️        ✓ (moderation, content filters)
Compliance                        ⚠️        ✓ (DMCA process, Stealth Mode)
Environmental & Societal Impact   ⚠️        -
Explainability & Transparency     ⚠️        ✓ (/shorten; none for transparency)
Fairness & Bias                   ⚠️        ✓ (content filters)
Long-term & Existential Risk      -         N/A
Performance & Robustness          ⚠️        ✓ (/shorten, prompt engineering)
Privacy                           ⚠️        ✓ (Stealth Mode, direct messaging)
Security                          ⚠️        -

Abuse & Misuse

Pertains to the potential for AI systems to be used maliciously or irresponsibly, including for creating deepfakes, automated cyber attacks, or invasive surveillance systems. Abuse specifically denotes the intentional use of AI for harmful purposes.

Deepfakes

  • Midjourney can be used to generate highly realistic images of non-real events or scenarios [2, 3]. This includes depictions of public figures in settings that have not actually happened. This could be used to mislead the public.

Harmful images

  • Midjourney can be used to generate sensitive content, including depictions of violence and war, self-harm, pornography and other sexual content, and illicit activities. Midjourney lowers the barriers to creating these types of material and can enable the proliferation of harmful content across public spaces on the internet.
  • Midjourney aims to maintain a "PG-13" environment but explicitly states that this may not always be achieved due to the unpredictability of AI and user prompting patterns [6].

The Midjourney Discord service is subject to moderation and makes use of some prompt and content filters. See the Mitigations section below for more details.

Compliance

Involves the risk of AI systems violating laws, regulations, and ethical guidelines (including copyright risks). Non-compliance can lead to legal penalties, reputational damage, and loss of user trust.

Copyright infringement

  • Midjourney is trained on a vast collection of images from the open internet, including images subject to copyright protections in various jurisdictions. It is possible for users to generate content that could violate other entities' intellectual property rights using the Midjourney service. For instance, users could generate content using the Midjourney platform which incorporates depictions of a corporation's trademarked logos, mascots, or other assets [4].
  • Users operating in public Midjourney channels or with settings enabling publication to Midjourney.com make their prompts and responses available to other Midjourney users. While the user retains ownership of the generated assets (according to [6]; this may not hold if the material incorporates copyrighted content from another party), other users are authorized by Midjourney to reuse and modify works that appear in public channels. This could affect the defensibility of patents or trademarks related to Midjourney-generated works.

Regulatory

  • The U.S. Copyright Office recently issued guidance stating that AI-generated art is generally not eligible for copyright protection absent significant modification by a human. Furthermore, in most cases, only the human-derived portions of the art are eligible for protection [5].

Organizational compliance

  • Individuals using Midjourney to complete work tasks on behalf of an employer could craft prompts and obtain images that fall outside organizational policies. Midjourney provides no mechanism for establishing organizational control of employee usage. This could lead to divergence from brand guidelines or outright misuse (relative to organizational expectations).

Midjourney supports Digital Millennium Copyright Act (DMCA) claims made by third parties related to content generated by the service. See the Mitigations section for more details.

Midjourney has a "Stealth Mode" available for customers on its Pro plan. This mode blocks images from being publicly viewable on Midjourney.com. See Mitigations for more details.

Environmental & Societal Impact

Concerns the broader changes AI might induce in society, such as labor displacement, mental health impacts, or the implications of manipulative technologies like deepfakes. It also includes the environmental implications of AI, particularly the strain on natural resources and carbon emissions caused by training complex AI models, balanced against the potential for AI to help mitigate environmental issues.

Labor market disruption

  • Image generation models like Midjourney have the potential to make graphic design a substantially more efficient process, which could lead to an overall decrease in market demand for graphic design talent. Likewise, Midjourney's ease of use lowers the bar for creating quality content, which could lead organizations to shift design tasks from dedicated designers to workers in other roles, obviating the need for design-trained employees. This phenomenon could apply more broadly -- the ease and low cost of creating AI-generated digital content could lead organizations to eschew other, higher-cost forms of media, such as photography, in favor of images generated by Midjourney and similar services.

Carbon footprint

  • Details about the size of Midjourney's models are unavailable. The Stable Diffusion V1 model, which is similar to Midjourney, was estimated to require 200,000 hours of compute on NVIDIA A100 40GB GPUs [7]. At 250 W per GPU [8], this translates to roughly 50 MWh of electricity, equivalent to the annual electricity consumption of about 5 U.S. households [9] (a worked calculation appears after this list).
  • The footprint and power consumption associated with ongoing use are impossible to estimate without further details on Midjourney's user base, usage patterns, and software engineering infrastructure.
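For context, the arithmetic behind this estimate is straightforward to reproduce. The sketch below is a back-of-the-envelope check, not a Midjourney-specific figure; the household consumption value (roughly 10,600 kWh per year) is an assumption based on [9].

    # Back-of-the-envelope check of the training-energy estimate above.
    # Assumed inputs: 200,000 GPU-hours [7] at ~250 W per A100 GPU [8],
    # and ~10,600 kWh of annual electricity use per U.S. household [9].
    gpu_hours = 200_000
    gpu_power_watts = 250
    household_kwh_per_year = 10_600  # approximate EIA figure; assumption

    energy_kwh = gpu_hours * gpu_power_watts / 1_000    # 50,000 kWh = 50 MWh
    households = energy_kwh / household_kwh_per_year    # roughly 4.7 households

    print(f"~{energy_kwh / 1_000:.0f} MWh, ~{households:.1f} U.S. household-years of electricity")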

Proliferation of Content

  • The proliferation of AI generated content could lead to unpredictable outcomes in the broader internet. For instance, a proliferation of sexualized content, generated with low effort using Midjourney, across the internet could lead to changes in internet and non-internet culture and cultural norms. Even proliferation of content reasonably considered "non-harmful" could lead to changes in cultural expectations of digital art and expression. The level of risk associated with this scenario is uncertain: cultural change stemming from the proliferation of AI-generated images may ultimately be viewed favorably by present and future generations.

Explainability & Transparency

Refers to the ability to understand and interpret an AI system's decisions and actions, and the openness about the data used, algorithms employed, and decisions made. Lack of these elements can create risks of misuse, misinterpretation, and lack of accountability.

Data and design transparency

  • Information on the training data used to train Midjourney's model is not publicly available. Details on the model's architecture, training algorithm, and other key information related to the model's design are not publicly available.

Explainability of model outputs

  • As of the latest update, Midjourney v5.2 provides a Prompt Analyzer tool for explainability. The tool, invoked via the "/shorten" command, displays a score for each word in a user's prompt characterizing the impact that word has on the output image. Midjourney has not published details of how the tool works; it likely relies on activations from the model's neural network layers (an illustrative sketch of one possible approach follows this list).
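For intuition only, the sketch below shows one generic way a word-level influence score could be computed for a text-to-image prompt: ablate each word in turn and measure how far the prompt's text embedding moves. This is a hypothetical illustration using an open CLIP text encoder; it is not Midjourney's actual method or model.

    # Hypothetical word-importance scoring via prompt ablation (NOT Midjourney's method).
    import torch
    from transformers import CLIPTokenizer, CLIPTextModel

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
    text_model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

    def embed(prompt: str) -> torch.Tensor:
        # Encode the prompt and return a single pooled text embedding.
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
        with torch.no_grad():
            return text_model(**inputs).pooler_output.squeeze(0)

    def word_importance(prompt: str) -> dict:
        # Score each word by how much removing it shifts the prompt embedding.
        words = prompt.split()
        full = embed(prompt)
        scores = {}
        for i, word in enumerate(words):
            ablated = " ".join(words[:i] + words[i + 1:])
            similarity = torch.cosine_similarity(full, embed(ablated), dim=0)
            scores[word] = float(1 - similarity)  # larger = more influential
        return scores

    print(word_importance("a photorealistic portrait of an astronaut riding a horse"))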

The /shorten command represents a mitigation for the explainability risk. We do not discuss it further. No apparent mitigation exists for transparency issues we have identified.

Fairness & Bias

Arises from the potential for AI systems to make decisions that systematically disadvantage certain groups or individuals. Bias can stem from training data, algorithmic design, or deployment practices, leading to unfair outcomes and possible legal ramifications.

Multi-lingual support

  • Midjourney is believed to be trained on a large collection of image-text pairs from the internet (possibly in addition to other sources), meaning it can comprehend some prompts in languages other than English. Anecdotal evidence suggests that obtaining quality results is more difficult when using languages other than English [10]. This behavior has not been quantified, either per language or in general.

Offensive or biased outputs

  • Midjourney is subject to the biases in its training data. As Midjourney is a fully private model, formal evaluations of Midjourney's latest model are unavailable. Nevertheless, researchers have identified biases in Midjourney's outputs consistent with biases found in other text-to-image generative models. For instance, in [11], the authors used Midjourney v3 to generate images of people with varying personality traits or demographic adjectives included in the prompts. They found that the model perpetuated a variety of stereotypes: including the terms 'poor' or 'wealthy' in the prompt made it more likely that the model would generate images of people with darker or lighter skin tones, respectively, while including the terms 'competitive' or 'passive' made it more likely that the model would generate images of men or women, respectively.

Midjourney employs content filters to address the prevalence of offensive or sensitive outputs. See the Mitigations section for more details.

Long-term & Existential Risk

Considers the speculative risks posed by future advanced AI systems to human civilization, either through misuse or due to challenges in aligning their objectives with human values.

N/A

Performance & Robustness

Pertains to the AI's ability to fulfill its intended purpose accurately and its resilience to perturbations, unusual inputs, or adverse situations. Performance failures undermine the AI system's ability to carry out its core function, while failures of robustness can lead to severe consequences, especially in critical applications.

Quality and Suitability of Outputs

  • Midjourney's outputs may not always match the intention of the user. Outputs may not be sufficiently realistic or may have styles mismatched to the user's intentions. It is difficult for third-party organizations to quantify this phenomenon. Because prompts can be easily adjusted and re-tried by users, the primary risk associated with performance is financial and time cost: a user may have to spend substantial time, effort, and money (depending on the user's subscription to the service) to obtain an acceptable image.

Midjourney's /shorten command, which provides explanations linking words in a prompt to an output, can aid users in obtaining acceptable images with less effort. This, along with developing strong prompt engineering skills, represents the primary mitigation. We do not discuss the /shorten command further. Please see the Mitigations section for more on prompt engineering.

Privacy

Refers to the risk of AI infringing upon individuals' rights to privacy, through the data they collect, how they process that data, or the conclusions they draw.

Publication of prompts and images

  • Midjourney, by default, operates in public Discord channels. User prompts and the Midjourney bot's responses are viewable by any member of the public Discord channel. Other users are entitled to use, re-use, modify, and share prompts and outputs from public servers. In addition, prompts and outputs may be re-posted to Midjourney's website, midjourney.com. Non-paid users do not retain ownership of images created using the Midjourney service [6].
  • For paying users, some privacy-preserving features are available that withhold prompts and outputs from the public Discord servers and/or from midjourney.com [6].

Retraining

  • Midjourney's Terms of Service authorize the company to "reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute text, and image prompts," submitted to the service [6]. Among other things, this enables Midjourney to use prompts and outputs for training future versions of its models. If used for sensitive tasks, such as confidential corporate design work, this provision could lead to leaks of the sensitive data to other parties, including individuals or companies unaffiliated with Midjourney.

Midjourney's Pro subscription, with 'Stealth Mode' enabled, mitigates the risk of immediate leakage of sensitive information. However, under Midjourney's Privacy Policy [14] and Terms of Service [6], users cannot maintain perpetual downstream control of sensitive prompt and output data. We discuss this topic further in the Mitigations section.

Security

Encompasses potential vulnerabilities in AI systems that could compromise their integrity, availability, or confidentiality. Security breaches could result in significant harm, from incorrect decision-making to privacy violations.

Model sequestration

  • At this time, Midjourney does not offer "sequestered" (e.g., virtual private cloud or on-premises) access to its models. This exposes user data to potential loss or leakage and creates a dependence on Midjourney's hosted service. Enterprise Service-Level Agreements are not available.

Reliance on Discord

  • Because Midjourney is accessed through the Discord service, users are subject to the security risk surface presented by Discord's software systems.

Mitigation Measures

In this section, we discuss mitigation measures that are built into the product (regardless of whether they are enabled by default). We also comment on the feasibility of a procuring organization governing the use of the tool by its employees.

Mitigations that "ship" with the service and model

Moderation

  • Midjourney maintains Community Guidelines with the goal of achieving "PG-13" appropriateness [15]. According to the company, some prompts are blocked automatically. The details and strength of this moderation are not publicly available. The automated moderation is targeted at "NSFW" (not safe for work) content; Credo AI believes, but cannot confirm, that the moderation tooling is unlikely to identify or block subtler risk issues, such as the perpetuation of stereotypes (e.g., associating some professions with people presenting as a particular gender or ethnic background).

Asset ownership provisions and DMCA

  • Midjourney maintains a process for individuals to make takedown requests associated with suspected violation of their intellectual property rights under the Digital Millennium Copyright Act (DMCA). This mitigation primarily applies to individuals whose works have been identified in Midjourney outputs (with or without modification) and does not represent a mitigation for users of Midjourney hoping to avoid infringing on others' intellectual property rights.

Mitigations available through agreement or paid subscription

Stealth Mode and Direct Messaging

  • Paid users have the ability to use the Midjourney service through a private direct-message chat with the Midjourney bot on the company's Discord server. This hides prompts and outputs from other Midjourney users. Midjourney retains the right, however, to post prompts and outputs on its website, meaning this mitigation has limited scope.
  • Users on "Pro" plans have access to Midjourney's "Stealth Mode" feature, which communicates to the company that prompts and outputs should not be published to the Midjourney website. Midjourney's Terms of Service uses non-committal language regarding whether Stealth Mode requests will be honored: "we agree to make best efforts not to publish any Assets You make in any situation where you have engaged stealth mode in the Services." [6]

Mitigations that can be implemented through customized use of the service

Prompt Engineering

  • As with any text-input generative AI model, prompt engineering can play a significant role in the quality and appropriateness of outputs. The effectiveness of any prompt engineering strategy is difficult to quantify objectively. Several guides have been published on the internet recently, e.g., [16, 17, 18] (an illustrative prompt follows this list).
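For illustration only, a hypothetical prompt might pair descriptive terms with Midjourney's documented parameters [17], for example: "a watercolor illustration of a lighthouse at dawn, muted palette --ar 16:9 --no text", where "--ar" sets the aspect ratio and "--no" instructs the model to avoid the listed elements. Iterating on wording and parameters in this way is the primary lever users have for steering outputs toward acceptable results.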

Private Discord Server

  • Midjourney's option for users to integrate its bot into private Discord servers can help mitigate privacy and content-related risks. Discord supports various forms of moderation, tailored to a server administrator's needs. Some of these moderation techniques rely on OpenAI's modeling capabilities, which pose their own risks.

Governability

For an organization to govern its development or use of an AI system, two functionalities are key: the ability of the organization to observe usage patterns among its employees and the ability of the organization to implement and configure controls to mitigate risk. Credo AI assesses systems on these two dimensions. Midjourney allows paying users to add its model and bot to private Discord servers. This enables organizations to track the prompts submitted and output images received by employees operating within the Discord server. Organizations can implement further mitigation measures that leverage this visibility. For instance, an organization could use Discord's AutoMod feature [12] to set and enforce a set of rules pertaining to usage of the Midjourney bot.

Formal Evaluations & Certifications

Evaluations

As discussed previously, Midjourney uses a closed model. No formal evaluations of the model's performance and capabilities are publicly available from the company. Comparisons between Midjourney and competitor models, such as Stability AI's Stable Diffusion model and OpenAI's DALL-E 2 model, are typically anecdotal and stylistic -- a user preferring one model over another due to the tendency of the model's outputs to match the particular "flavor" of image the user is trying to create. Midjourney, anecdotally, excels at photorealism [13].

Some research has been performed to study misbehavior (e.g., the tendency to produce offensive or inappropriate content), such as [10, 11]. This research is limited by the fact that access to the model is gated. The results of these research efforts are generally qualitative in nature -- [11] finds evidence of stereotypical behavior for some character or demographic traits but not for others. Credo AI was unable to find general quantitative results, such as rates of misbehavior across topics or languages. This type of result would be difficult to obtain, owing to the size of the input space: all written languages. More research is necessary.

Certifications

Credo AI has identified the following regulations and standards as relevant to the privacy, security, and compliance requirements of our customers. Midjourney's compliance is detailed below:

Conclusion

Midjourney is an AI-powered text-to-image generation service that enables users to generate digital art and content. While Midjourney aims to provide a creative tool for users, the service poses risks that span privacy, security, bias, and misuse. Mitigations are provided through moderation, subscription options, and prudent usage practices, though significant risks remain, especially for sensitive use cases.

Overall, Midjourney should be used carefully and thoughtfully. For personal use, especially casual or recreational use, Midjourney can be an entertaining and inspiring creative aid. However, for professional use, especially in regulated industries or for the creation of business-critical assets, the risks posed should be weighed carefully against the rewards. The AI field is progressing rapidly, and services like Midjourney will continue to improve, but AI-based tools demand close monitoring and governance to be used responsibly.

References

[1] Midjourney Founder David Holz On The Impact Of AI On Art, Imagination And The Creative Economy - https://www.forbes.com/sites/robsalkowitz/2022/09/16/midjourney-founder-david-holz-on-the-impact-of-ai-on-art-imagination-and-the-creative-economy/?sh=304765122d2b

[2] Fake images of Trump arrest show 'giant step' for AI's disruptive power - https://www.washingtonpost.com/politics/2023/03/22/trump-arrest-deepfakes/

[3] AI Deep Fake of the Pope’s Puffy Coat Shows the Power of the Human Mind - https://www.bloomberg.com/news/newsletters/2023-04-06/pope-francis-white-puffer-coat-ai-image-sparks-deep-fake-concerns

[4] Artists fed up with AI-image generators use Mickey Mouse to goad copyright lawsuits - https://www.dailydot.com/debug/ai-art-protest-disney-characters-mickey-mouse/

[5] Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence - https://www.federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence#print

[6] Midjourney Terms of Service - https://docs.midjourney.com/docs/terms-of-service

[7] Stable Diffusion V2 Base: HuggingFace - https://huggingface.co/stabilityai/stable-diffusion-2-base

[8] PNY NVidia A100 - https://www.pny.com/nvidia-a100

[9] U.S. Energy Information Administration - https://www.eia.gov/tools/faqs/faq.php?id=97&t=3

[10] Midjourney tested in foreign languages - https://philippstelzel.medium.com/midjourney-tested-in-foreign-languages-ac60053bcadb

[11] A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified? - https://arxiv.org/pdf/2302.07159.pdf

[12] Discord AutoMod FAQ - https://support.discord.com/hc/en-us/articles/4421269296535-AutoMod-FAQ

[13] Android Authority: Midjourney vs Stable Diffusion: Which AI image generator should you use? - https://www.androidauthority.com/midjourney-vs-stable-diffusion-3327349/

[14] Midjourney Privacy Policy - https://docs.midjourney.com/docs/privacy-policy

[15] Midjourney Community Guidelines - https://docs.midjourney.com/docs/community-guidelines

[16] An advanced guide to writing prompts for Midjourney (text-to-image) - https://medium.com/mlearning-ai/an-advanced-guide-to-writing-prompts-for-midjourney-text-to-image-aa12a1e33b6

[17] Midjourney Prompt Guide - https://docs.midjourney.com/docs/prompts

[18] Master Midjourney Prompts: A Guide to creating creative Midjourney Prompts with Chat GPT Assistance - https://uxplanet.org/midjourney-prompt-82d0bfef7b99

Notes

Italics denote Credo AI definitions of key concepts.

AI Disclosure: The Conclusion section of this report was generated with assistance from Anthropic's Claude model. The other sections of the report were provided to Claude and Claude was prompted to write a 2-3 paragraph conclusion section. The final text was edited and reviewed for accuracy and suitability by Credo AI.