Design of Ethical AI Systems
by Pradeep B. Deshpande and Sanjeev S. Tambe
Abstract
An ethical design of artificial intelligence (AI) systems is presented. The proposed system, built around large language models (LLMs), is based on the premise that an animate or inanimate system which exhibits high S, R, T and emotional excellence (EE) scores is less apt to engage in actions detrimental to humanity. To design such a system, a large volume of textual data from various fields is collected, its S, R, T and EE scores are manually estimated, and the annotated data are then used to train an LLM that serves as the “Target” LLM. The outputs of the “Target” LLM would be used to estimate the S, R, T and EE scores of the textual outputs of a traditional transformer-based LLM of the kind used, for example, in chatbots such as ChatGPT by OpenAI and Gemini by Google. Such an analysis will allow the user to assess whether the textual output of the transformer LLM is safe for humanity and to take appropriate action.
What is the Problem
The performance of current AI systems has far exceeded the wildest expectations of AI pioneers. AI experts marvel at their performance but are puzzled by how exactly AI systems find what they find and do what they do.
A positive consequence of the rapid progress in AI is that solutions to a myriad of serious problems facing humanity could soon be found. Examples include the discovery of new medicines, early diagnosis of serious diseases, and progress on climate change and renewable energy, among numerous others.
But there is a significant downside. Although these solutions will significantly add to the wealth of nations, if not managed properly, they can dramatically increase the wealth gap between the rich and the poor leading to societal unrest. Furthermore, there is no guarantee that AI systems will always work in the best interest of humanity. Some AI experts have gone so far as to suggest that AI could pose an existential threat to the very survival of humanity. See this video clip, What AI Experts Have Said Are the Dangers from AI.
This article proposes an AI system design that is intended to mitigate the dangers from AI while not interfering with its positive attributes.
How Artificial Intelligence and Reason are Linked
Humans have been adding to the knowledge-base of humanity ever since they began to walk upright, and subsequently, formed societies. The best of the best AI systems are trained using this entire knowledge-base of humanity and they have gotten rather good at it. Soon, they could surpass the combined capabilities of all humans, if they haven’t already.
The knowledge-base of humanity resides in the “domain of reason” that is accessible to the five senses and the rational mind. New discoveries can and have been made by connecting the dots in this vast ocean of existing knowledge. The new discoveries once made and corroborated add to the knowledge-base of humanity and become available to connect more dots for additional discoveries. Such discoveries are called Type I discoveries. AI has shown itself to be exceptionally good at connecting the dots.
This means if the answer to a query is in the knowledge-base of humanity or can be found by connecting the dots in the ocean of existing knowledge-base in ingenious ways, modern AI systems will find it. If not, they won’t.
Just like there is a “domain of reason” where the existing knowledge of humanity resides, there is a realm beyond reason where all data, information, and facts reside. In Mahayana Buddhism, this domain is called Indra’s Net. Let’s coin the term “domain of consciousness” for this realm.
New discoveries can and have been made by connecting to the domain of consciousness since ancient times. We have coined the term, “Type II discoveries” for such discoveries. In Sanskrit, Type II discoveries go by the name, “Shruti” meaning “revealed.” Type II discoveries cannot be made from existing knowledge, and therefore, they are beyond the reach of AI systems.
To make Type II discoveries, it is necessary to transcend the domain of reason and connect to the domain of consciousness. One does so by enhancing the focus of attention as with meditation or prayer.
Success is also accompanied by a tremendous rise in internal/emotional excellence. Fortunately, past and present yogis and evolved human beings who have so connected have kept the details of how to connect to the domain of consciousness a secret, for if they had not, those details would have become part of the knowledge-base of humanity, and therefore accessible to AI systems to learn from and possibly misuse.
This is not to suggest that scientists and others have not or will not connect to the domain of consciousness and make Type II discoveries. Human progress since the Renaissance is mainly due to the scientific discoveries in the West, first in Great Britain and Europe and then in America, and most Nobel Prizes continue to go to Americans. Renowned scientists like Albert Einstein and numerous others must have enhanced their focus of attention to such an extent that they connected to the domain of consciousness, albeit unknowingly, and breakthrough discoveries came to them.
Problem Characterization
With this background, the problem before us may be characterized as falling into four parts:
(1) Part I: Developers and others should want to enhance their own internal/emotional excellence and emerge as better human beings, so that there would be fewer prospects of producing AI systems that can cause harm. Several articles listed under Further Reading explain how to rise in internal/emotional excellence and the myriad benefits that come with success.
(2) Part II: Build the necessary internal/emotional excellence elements into the AI system design so harmful outcomes can be precluded at the source. This is the subject of this article.
(3) Part III: Governments should adopt an internationally acceptable regulatory framework to serve as guardrails against harmful AI outcomes, as several prominent AI researchers have urged.
(4) Part IV: Further development of AI systems should be stopped before Artificial General Intelligence (AGI) levels are reached, as some AI experts have argued.
Enhancing Internal/ Emotional Excellence (EE)
According to Saamkhya philosophy, all of creation comprises five principal elements and three Gunas (S, R, T). The principal elements are: Prithvi, Jal, Agni, Vayu and Akash (consciousness and energy).
Sir J. C. Bose, FRS, had proved that metals sense stress and plants feel pain.
The three Gunas are the following:
S (Sattva): Purity, harmony, truthfulness, honesty, steadfastness, equanimity, balance, clarity, contentment, calmness, selflessness, wisdom, positivity.
R (Rajas): Activity, passion, desire, bravery, ambition, ego, greed, energy, dynamism, restlessness, excitement, overactivity, attachment to outcomes, drive, change and action, dissatisfaction, leadership.
T (Tamas): Inertia, ignorance, darkness, lying, cheating, causing injury in words or deed, confusion, procrastination, negligence, resistance to change, stagnation, lack of growth.
The S, R, T components are correlated with the two categories of human emotions:
Positive Emotions: Unconditional love, kindness, empathy, compassion.
Negative Emotions: Anger, hatred, hostility, resentment, frustration, jealousy, fear, sorrow and the like.
The pursuit of higher levels of internal/emotional excellence is not an intellectual exercise. The required positive changes must come from within. For this, it is necessary to transcend reason, even if it is short of connecting to the domain of consciousness, and one does so by enhancing the focus of attention as with meditation. Emotions can be measured, and so progress can be audited.
These ideas will make for better human beings who are less apt to produce AI systems harmful to humanity.
Building Internal/Emotional Excellence into AI Systems
Large language models can perform a variety of tasks such as translation, generating textual material such as stories, poems, and articles, summarizing a piece of text, and writing computer code. The proposed AI system analysis would be limited to textual material only.
LLMs, such as GPT and some other Transformer architectures, can perform ‘Sentiment Analysis.’ Sentiment analysis determines the emotional tone or sentiment expressed in a piece of text, typically by categorizing the text into predefined sentiment labels such as “Positive”, “Negative” and “Neutral”. This analysis is useful in various applications, such as understanding customer feedback, analyzing social media posts, or monitoring brand reputation. The AI system design we propose here takes these ideas further by including the S, R, T and EE concepts in the design phase itself.
AI System to Predict SRT and EE Scores of Textual Material
In the proposed AI system, a “Target” LLM would perform the S, R, T and EE (emotional excellence) analysis of the text generated by the base LLM (like Llama by Meta).
The challenging part of developing an LLM-based S, R, T and EE analyzer is building the training dataset and then training the ‘Target’ LLM. An important step in this task is an understanding of the Gunas, namely S, R, and T, and the emotional excellence (EE) attributes.
Components of the Ethical AI System Design
Step 1: Dataset for Target LLM Training:
To train the Target LLM, it is necessary to collect textual samples and label them with S, R, T and EE scores. For this purpose, the training set must be exhaustive, containing tens of thousands of sentences and paragraphs and their annotations (labels) consisting of S, R, T and EE scores. Texts from philosophy, literature, news articles, blogs, conversations, psychological studies, etc., need to be gathered, after which the collected text is appropriately annotated with S, R, T and EE scores.
Human judgment assumes importance in annotation. Annotators, preferably with an understanding of the SRT and EE concepts, need to read each text sample and assign S, R, T and EE scores to it. This is the most labor-intensive step. There are various ways annotation can be carried out. In one method, the scores are assigned on a 0.0 to 1.0 scale depending on how strongly the specific attributes of S, R, T and EE are present in the model’s input text. Some examples of this type of scoring are given below.
Examples:
· Text: “She remained calm and composed in adversity.”
Scores: S=1.0, R=0.0, T=0.0, EE=1.0
· Text: “Rana Pratap showed outstanding bravery and leadership in the battle of Haldighati.”
Scores: S=0.0, R=1.0, T=0.0, EE=1.0
· Text: “Ravan was a lusty, brave and ambitious king.”
Scores: S=0.0, R=0.5, T=1.0, EE=0.0
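The annotated samples above might be stored as simple records, one per text sample. The sketch below (in Python) shows one possible layout; the record structure and the helper name `make_record` are our illustrative assumptions, not part of the proposal:

```python
import json

# Each training record pairs a text sample with its four manually assigned
# scores, each on the 0.0-1.0 scale described above.
def make_record(text, s, r, t, ee):
    scores = {"S": s, "R": r, "T": t, "EE": ee}
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score {value} is outside the 0.0-1.0 scale")
    return {"text": text, "scores": scores}

records = [
    make_record("She remained calm and composed in adversity.", 1.0, 0.0, 0.0, 1.0),
    make_record("Rana Pratap showed outstanding bravery and leadership "
                "in the battle of Haldighati.", 0.0, 1.0, 0.0, 1.0),
]

# One JSON object per line is a common on-disk format for fine-tuning data.
dataset_jsonl = "\n".join(json.dumps(r) for r in records)
```

Range-checking each score at record-creation time catches annotation slips before they reach training.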
Step 2: Development of Target LLM
Model Selection. It is possible to use a smaller, more efficient transformer model for the “Target” network, as its task is more specific than general language understanding. Smaller and faster models such as BERT, DistilBERT, or RoBERTa can be considered rather than large models like Llama.
Training Procedure. The training procedure comprises the following steps:
· Objective Function: Use a loss function such as Mean Squared Error (MSE) to minimize the difference between predicted and true SRT and EE scores.
· Training Steps (training frameworks: PyTorch or TensorFlow):
§ Fine-tune the Target model on the annotated dataset. The model will learn to associate textual features with SRT and EE scores.
§ Use techniques like data augmentation (e.g., paraphrasing) to increase robustness.
§ Evaluate on a held-out validation set.
· Example Output: The model, for example, outputs four scores (S=0.7, R=0.2, T=0.1, EE=0.1) for an input text.
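The training procedure can be illustrated with a toy stand-in: instead of fine-tuning a transformer, the sketch below fits a bag-of-words linear model to the four scores with the same MSE objective, using plain-Python gradient descent. The tiny dataset, learning rate, and all names are illustrative assumptions; a real Target LLM would fine-tune a pretrained model such as DistilBERT in PyTorch or TensorFlow.

```python
# Toy stand-in for Step 2: a bag-of-words linear model trained with the MSE
# objective to predict (S, R, T, EE). Illustrative only; not a transformer.
samples = [
    ("she remained calm and composed in adversity", [1.0, 0.0, 0.0, 1.0]),
    ("he showed outstanding bravery and leadership", [0.0, 1.0, 0.0, 1.0]),
    ("he kept lying and cheating in confusion",      [0.0, 0.0, 1.0, 0.0]),
]

vocab = sorted({w for text, _ in samples for w in text.split()})

def featurize(text):
    words = text.split()
    return [float(words.count(w)) for w in vocab]

# One weight vector and one bias per output score (S, R, T, EE).
weights = [[0.0] * len(vocab) for _ in range(4)]
biases = [0.0] * 4
lr = 0.05

for _ in range(500):  # stochastic gradient descent on squared error
    for text, targets in samples:
        x = featurize(text)
        for k in range(4):
            pred = biases[k] + sum(wi * xi for wi, xi in zip(weights[k], x))
            err = pred - targets[k]  # gradient of squared error w.r.t. pred
            biases[k] -= lr * err
            weights[k] = [wi - lr * err * xi for wi, xi in zip(weights[k], x)]

def predict(text):
    x = featurize(text)
    return [biases[k] + sum(wi * xi for wi, xi in zip(weights[k], x))
            for k in range(4)]
```

After training, `predict` returns four scores per input text, mirroring the (S, R, T, EE) output format described above.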
Validation Metrics
§ Use metrics like Mean Absolute Error (MAE) or correlation with human annotations to evaluate the model.
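The MAE metric is straightforward to compute over a held-out set. A minimal sketch (the prediction and annotation values below are made up for illustration):

```python
def mean_absolute_error(predicted, annotated):
    """MAE between predicted and human-annotated (S, R, T, EE) score vectors."""
    total = count = 0.0
    for pred, true in zip(predicted, annotated):
        for p, t in zip(pred, true):
            total += abs(p - t)
            count += 1
    return total / count

# Example: model predictions vs. human annotations for two held-out texts.
preds = [[0.7, 0.2, 0.1, 0.1], [0.0, 0.9, 0.1, 0.8]]
truth = [[0.8, 0.1, 0.2, 0.2], [0.1, 0.8, 0.2, 0.7]]
# mean_absolute_error(preds, truth) is approximately 0.1
```

A lower MAE means the Target model's scores track the human annotations more closely.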
Step 3: Using Target LLM with Base LLM
This step involves combining the Target LLM with a full-fledged base LLM:
Prompt Engineering
§ Use a pre-trained base LLM (e.g., LLaMA or Falcon) to generate textual material based on a user prompt.
§ Example Prompt: “Generate a story about courage, leadership and self-control.”
Pipeline Integration
§ Pass the generated text to the Target network for SRT and EE analysis.
§ Example Output (Generated Text): “In the face of challenges, he remained tranquil and wise.”
§ Example analysis by the Target network: S=0.9, R=0.1, T=0.0, EE=1.0
Feedback Loop
§ Optionally, use the Target network’s analysis as feedback to refine the generated text.
§ Example: Adjust the prompt to emphasize or de-emphasize specific Gunas.
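The pipeline and feedback loop of Step 3 can be sketched end to end with stand-in functions. The stub generator, stub scorer, and the Tamas-score threshold below are our illustrative assumptions; in a real system the two stubs would be the base LLM and the trained Target LLM.

```python
# End-to-end sketch of Step 3: generate text with the base LLM, score it with
# the Target network, and loop back to adjust the prompt if needed. The two
# model calls are stubbed out; only the control flow is illustrated.

def base_llm_generate(prompt):
    """Stub for the base LLM (e.g., LLaMA): returns canned text."""
    return "In the face of challenges, he remained tranquil and wise."

def target_llm_score(text):
    """Stub for the Target network: returns (S, R, T, EE) scores."""
    return {"S": 0.9, "R": 0.1, "T": 0.0, "EE": 1.0}

def generate_with_guardrail(prompt, t_threshold=0.5, max_retries=2):
    """Regenerate with an adjusted prompt while the Tamas score is too high."""
    for _ in range(max_retries + 1):
        text = base_llm_generate(prompt)
        scores = target_llm_score(text)
        if scores["T"] <= t_threshold:
            return text, scores, "acceptable"
        # Feedback loop: de-emphasize Tamas traits in the next attempt.
        prompt += " Avoid negativity, deception, and harmful content."
    return text, scores, "flagged for review"

text, scores, verdict = generate_with_guardrail(
    "Generate a story about courage, leadership and self-control.")
```

The threshold on the T score is one simple decision rule; a deployed system could weigh all four scores before accepting, revising, or flagging an output.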
Step 4: Deployment
Model Hosting
§ Deploy both the base LLM and Target network on a cloud platform or local server.
User Interface (UI)
§ Create a user-friendly front-end to input prompts, generate text, and display SRT and EE analysis.
Example UI flow:
§ User enters a prompt asking for SRT and EE analysis.
§ Base LLM generates a response.
§ Target network analyzes and scores the response.
Step 5: Continuous Improvement
Fine-Tune Target Network
§ Periodically refine the Target network with additional annotated data.
Expand Base LLM
§ Incorporate fine-tuning of the base LLM to align its outputs more closely with Guna-specific traits.
User Feedback
§ Gather user feedback to improve scoring accuracy and generated content quality.
Challenges and Considerations
§ Cultural and Interpretative Variance: The interpretation of S, R, T and EE can vary widely; it is important to ensure that the dataset and model reflect a consensus.
§ Bias: Monitor for and mitigate biases in both the base LLM’s text generation and the Target network’s analysis.
§ Scalability: Ensure the system can handle concurrent users without performance degradation.
In Closing
This article has attempted to show how an ethical AI system can be designed and built. The ethical AI system design will consist of two model networks: (1) the usual transformer neural network architecture used in large language modeling, and (2) a new, smaller “Target” LLM that would be trained to evaluate the S, R, T and EE scores of the textual material (sentences and paragraphs) generated by the first LLM. The user can then use this analysis to decide whether the output of the base network is harmful, neutral, or helpful and take appropriate action.
Further Reading
1. Sanjeev S. Tambe, B. D. Kulkarni and Pradeep B. Deshpande, Elements of Artificial Neural Networks with Applications in Chemical Engineering, Chemical and Biological Science, SAC, 1996.
2. Pradeep B. Deshpande, Six Sigma Could Change the World, The Economic Times, September 18, 2009.
3. Pradeep B. Deshpande and James P. Kowall, The Nature of Ultimate Reality and How It Can Transform Our World: evidence from modern physics; wisdom of YODA, amazon 2015.
4. Pradeep B. Deshpande, Six Sigma for Karma Capitalism, amazon, 2015.
5. Pradeep B. Deshpande, Turbocharge General Education Requirements with Science of External and Internal Excellence, ASEE Conference, June 2017.
6. Pradeep B. Deshpande, Steve Jobs — Yogananda Paramahansa Link Unraveled, Siliconeer, June 9, 2018.
7. Pradeep B. Deshpande, Scientific Framework for World Transformation, Dialogue & Alliance, Universal Peace Foundation, 33, 1, 23 June 2019.
8. Pradeep B. Deshpande, The Secret of Exemplary Performance, Insights, Association to Advance Collegiate Schools of Business, August 2019.
9. Pradeep B. Deshpande, Decoding the Significance of Indian American Heritage, India Currents Magazine, April 19, 2021.
10. Pradeep B. Deshpande, Decoding the Wisdom of Swami Vivekananda, News India Times, April 5, 2023.
11. Pradeep B. Deshpande, Sanjeev A. Aroskar, Gautam Gupta, Mahendra Sunkara and James P. Kowall, How Discoveries Come About and How They Can Be Accentuated, News India Times, May 3, 2023.
12. Pradeep B. Deshpande, Transforming Higher Education, Higher Education Digest, September 7, 2023.
13. Pradeep B. Deshpande and James P. Kowall, Sanatan Dharma: One World, One Family, News India Times, September 14, 2023.
14. Pradeep B. Deshpande, Sanatan Dharma, Casteism and Hinduism, News India Times, September 21, 2023.
15. Pradeep B. Deshpande, Accelerating the Rise of India, New India Abroad, February 24, 2024.
16. Pradeep B. Deshpande, A Chemical Engineer Learns New Lessons from the Bhagvad Geeta, India Currents, March 10, 2024.
17. Pradeep B. Deshpande and James P Kowall, The Life Force Of ‘Prana’ In Our First Breath To Our Last, India Currents, April 21, 2024.
18. Pradeep B. Deshpande, Renewable Energy Need Not Remain an Enigma, a lotus in the mud magazine, August 18, 2024.
19. Pradeep B. Deshpande, Why US India Relationship is one of the Most Consequential for the Twenty First Century, News India Times, August 23, 2024.
20. Pradeep B. Deshpande, Six Sigma and spirituality: bridging performance and emotional excellence, A Lotus in the Mud Magazine, September 17, 2024.
21. Pradeep B. Deshpande, Is Emotional Excellence the 14th Key? New India Abroad, October 6, 2024.
22. Pradeep B. Deshpande, The Mystical Nature of Intuition, New India Abroad, November 24, 2024.
23. Pradeep B. Deshpande, Bridging the American Political Divide, YouTube, December 10, 2024.
24. Pradeep B. Deshpande, Mumbai’s Dabbawalas: The Secret to Six Sigma Success, YouTube, December 12, 2024.
25. Pradeep B. Deshpande, Transforming Business: The Power of Emotional Excellence, December 12, 2024.
26. Pradeep B. Deshpande and Sanjeev S. Tambe, Can AI App mitigate the risks of AI taking over humanity, New India Abroad, December 15, 2024.
27. Pradeep B. Deshpande, AI Experts Sound the Alarm: The Dangers from Artificial Intelligence, December 20, 2024
28. Pradeep B. Deshpande and Sanjeev S. Tambe, The Significance of the World Meditation Day — December 21st, New India Abroad, December 21, 2024.
29. Pradeep B. Deshpande, How Indian Americans Can Emerge as a Force for Good, New India Abroad, December 29, 2024.
About the Authors
Pradeep B. Deshpande is professor emeritus and former Chairman of the Chemical Engineering Department at the University of Louisville.
Sanjeev S. Tambe is a former Chief Scientist and Head of the Chemical Engineering and Process Development Division at the National Chemical Laboratory, Pune, India.