In the realm of AI-generated content, the fusion of different modalities like images and text has opened up exciting possibilities. For instance, we can now generate images from textual inputs using diffusion models, or even realistic minute-long videos using Sora from OpenAI. However, none of the implementations we are aware of work well with abstract elements like lyrics. For instance, the output generated by Stable Diffusion 2 for a metal-genre song just takes the textual input as is.
Hence, our initial surmise is that state-of-the-art models aren’t yet able to capture the abstractness present in music and its lyrics. This is exactly the gap Blitzfa is working to close!
Recently, we introduced the LAG dataset, a rich resource containing music features paired with lyrics and genre information. Leveraging this dataset, we trained various multimodal architectures right out of the box and further tweaked them after analysing the outcomes. One such multimodal architecture we examined was VQGAN-CLIP.
Giving a gist of VQGAN-CLIP
VQGAN-CLIP is a powerful combination of generative and discriminative AI. The VQGAN component is responsible for generating an image from a latent (mostly Gaussian noise, or the encoded output of a VAE). The generated image is then encoded by CLIP, which compares the distance between the image embedding and the embedding of the text provided by the user (i.e. the prompt), giving us a contrastive loss that steers generation.
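To make that loop concrete, here is a minimal PyTorch-style sketch of the idea. The `vqgan` and `clip_model` objects and their method names are placeholders for whatever pre-trained implementations you load, not the exact VQGAN-CLIP API; only the overall structure (decode a latent, score it with CLIP, backprop into the latent) mirrors the technique.

```python
import torch
import torch.nn.functional as F

def clip_guided_generation(vqgan, clip_model, text_tokens, steps=300, lr=0.1):
    """Optimise a latent so the decoded image matches the prompt.

    `vqgan.decode(z)`, `clip_model.encode_image`, and `clip_model.encode_text`
    are placeholder names for a pre-trained VQGAN decoder and CLIP encoders.
    """
    # Start from a random latent (roughly, Gaussian noise in latent space).
    z = torch.randn(1, 256, 16, 16, requires_grad=True)
    optimiser = torch.optim.Adam([z], lr=lr)

    with torch.no_grad():
        text_emb = F.normalize(clip_model.encode_text(text_tokens), dim=-1)

    for _ in range(steps):
        optimiser.zero_grad()
        image = vqgan.decode(z)  # VQGAN renders the current latent
        image_emb = F.normalize(clip_model.encode_image(image), dim=-1)
        # Contrastive-style objective: maximise cosine similarity between
        # the image embedding and the prompt embedding.
        loss = 1.0 - (image_emb * text_emb).sum(dim=-1).mean()
        loss.backward()
        optimiser.step()

    return vqgan.decode(z).detach()
```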
While we were still in the middle of this experimentation, we observed something interesting enough to share with the community online, and that’s the focus of this blog.
Something intriguing…
One of the pivotal experiments we conducted involved integrating our own attention mechanisms into the architecture to produce a more refined output. We arrived at this hypothesis because the outputs we generated had a tinge of an animated feel, taking away the wrath of metal music or the bop energy of pop music.
By incorporating attention layers over the input prompt, we aimed to capture the nuances and context embedded within the lyrics. We experimented with both multi-head and Bahdanau attention mechanisms, each offering unique insights into the interplay between words and their relevance.
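As a rough illustration of the multi-head variant, here is a small PyTorch module one could wrap around the prompt token embeddings before they reach CLIP. The module name, dimensions, and placement are illustrative assumptions rather than our exact implementation.

```python
import torch
import torch.nn as nn

class PromptAttention(nn.Module):
    """Self-attention over lyric/prompt token embeddings (illustrative)."""

    def __init__(self, embed_dim=512, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_dim)
        attended, weights = self.attn(
            token_embeddings, token_embeddings, token_embeddings
        )
        # Residual connection keeps the original prompt content while the
        # attention weights re-emphasise the more relevant words.
        return self.norm(token_embeddings + attended), weights


# Quick shape check with random embeddings standing in for a tokenised lyric.
tokens = torch.randn(1, 77, 512)
out, attn_weights = PromptAttention()(tokens)
print(out.shape, attn_weights.shape)  # (1, 77, 512) and (1, 77, 77)
```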
After training the attention layers and then running inference with the same input parameters, the outputs were interesting to look at.
As we can see, the output generated is less cartoonish now. We can see the sand grains. Moreover, it also encapsulates the “buried deep down” part quite well. Upon closer inspection, we can notice a shovel-like object at the top left, which it generated poorly.
This output was interesting to analyze because none of this was mentioned in the input. The entire essence of the lyric and its corresponding music snippet evokes a statement scene from a superhero movie. The audio evokes a strange emotion of defeating anyone in your path (at least for me). The entire point of audio-based generation is that it’s subjective, which makes it quite challenging. Hence, from the above experiment, we understood that a custom attention layer was necessary for adding a touch of “reality” to the output. The attention layers were trained with the CLIP loss in a way that gives more weight to the genre. The following diagram should clear things up for you.
Custom attention layers, when trained properly, can be chained to add different layers of complexity to your output.
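To give a sense of what “more weight to the genre” could look like in practice, here is a hedged sketch of a genre-weighted CLIP-style loss. The embeddings, their sources, and the `genre_weight` knob are illustrative assumptions, not our exact training objective.

```python
import torch.nn.functional as F

def genre_weighted_clip_loss(image_emb, lyric_emb, genre_emb, genre_weight=2.0):
    """Blend two cosine-distance terms, weighting the genre term higher.

    All three embeddings are assumed to come from CLIP encoders; the
    relative weighting is an illustrative hyperparameter.
    """
    image_emb = F.normalize(image_emb, dim=-1)
    lyric_emb = F.normalize(lyric_emb, dim=-1)
    genre_emb = F.normalize(genre_emb, dim=-1)

    lyric_loss = 1.0 - (image_emb * lyric_emb).sum(dim=-1).mean()
    genre_loss = 1.0 - (image_emb * genre_emb).sum(dim=-1).mean()

    # The genre term dominates, nudging the attention layers towards
    # genre-consistent (e.g. less cartoonish) outputs.
    return (lyric_loss + genre_weight * genre_loss) / (1.0 + genre_weight)
```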
While this is just the tip of the iceberg of what we have found, we’ll be releasing more of our findings as we go forward. If you’d like to keep in touch with our journey, please subscribe to our Medium to be notified when our next blog comes up.