How can I solve the wrong shape in DataLoader?

Issue

I have a text dataset that I want to use for a GAN. It needs to be one-hot encoded, and this is how I created a custom Dataset for my files:

import torch
import torch.nn.functional as F

class Dataset2(torch.utils.data.Dataset):
    def __init__(self, list_, labels):
        'Initialization'
        self.labels = labels
        self.list_IDs = list_

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.list_IDs)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample
        mylist = self.list_IDs[index]

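        # One-hot encode the sample and get its label
        # (alphabet is defined elsewhere; mylist must be a LongTensor of class indices)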
        X = F.one_hot(mylist, num_classes=len(alphabet))
        y = self.labels[index]

        return X, y

The dataset works fine whenever I index it directly. The problem appears when I wrap it in a DataLoader: the shape of what the loader returns is not the same as the shape of a sample taken from the dataset. This is the shape that comes out of the dataset:

x, _ = dataset[1]
x.shape

torch.Size([1274, 22])

and this is the shape that comes out of the DataLoader:

from torch.utils.data import DataLoader

dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

one = []
for epoch in range(epochs):
    for i, (real_data, _) in enumerate(dataloader):
        one.append(real_data)
one[3].shape

torch.Size([4, 1274, 22])

The 4 is the number of samples in my data, but I did not expect it to appear in the shape. How can I fix this problem?

Solution

You confirmed that your dataset contains only four elements. You wrapped it in a data loader with batch_size=64, which is greater than 4. This means that in each epoch the data loader outputs a single batch containing all 4 elements.

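In turn, this means you append only a single element per epoch, so one[3] is the batch from the fourth epoch (the only batch the data loader yields in that epoch). The data loader stacks the samples along a new leading batch dimension, which is why the batch is shaped (4, 1274, 22). The leading 4 is the batch dimension rather than an error; indexing into the batch (for example real_data[0]) gives back a single sample of shape (1274, 22).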
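The behaviour described above can be reproduced with a minimal sketch. It assumes random integer sequences in place of the real text data; ToyDataset, NUM_CLASSES and SEQ_LEN are hypothetical stand-ins for Dataset2, len(alphabet) and the sequence length seen in the question, not the asker's actual code.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset

NUM_CLASSES = 22   # assumption: stands in for len(alphabet) in the question
SEQ_LEN = 1274     # assumption: sequence length taken from the shapes in the question


class ToyDataset(Dataset):
    """Hypothetical stand-in for Dataset2: one-hot encodes integer sequences."""

    def __init__(self, sequences, labels):
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, index):
        # Each sample has shape (SEQ_LEN, NUM_CLASSES) after encoding.
        x = F.one_hot(self.sequences[index], num_classes=NUM_CLASSES)
        return x, self.labels[index]


# Four random samples, matching the dataset size in the question.
sequences = torch.randint(0, NUM_CLASSES, (4, SEQ_LEN))
labels = torch.zeros(4, dtype=torch.long)
dataset = ToyDataset(sequences, labels)

sample_shape = dataset[1][0].shape

# batch_size exceeds the dataset size, so the loader yields one batch of 4.
big_loader = DataLoader(dataset, batch_size=64, shuffle=True)
big_shapes = [batch.shape for batch, _ in big_loader]

# A smaller batch size gives more batches, each with its own leading batch dimension.
small_loader = DataLoader(dataset, batch_size=2, shuffle=True)
small_shapes = [batch.shape for batch, _ in small_loader]

print(sample_shape)
print(len(big_loader), big_shapes)
print(len(small_loader), small_shapes)
```

With batch_size=64 the loader produces one batch per epoch whose first dimension equals the dataset size; with batch_size=2 it produces two batches whose first dimension is 2. In both cases the per-sample shape is unchanged behind the batch dimension.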

Answered By – Ivan

Answer Checked By – Jay B. (AngularFixing Admin)
